Sugar and lipid metabolism regulators in plants III

ABSTRACT

Isolated nucleic acids and proteins associated with lipid and sugar metabolism regulation are provided. In particular, lipid metabolism proteins (LMP) and encoding nucleic acids originating from Arabidopsis thaliana are provided. The nucleic acids and proteins are used in methods of producing transgenic plants and modulating levels of seed storage compounds. Preferably, the seed storage compounds are lipids, fatty acids, starches or seed storage proteins.

RELATED APPLICATIONS

This application is a divisional of U.S. patent application Ser. No. 11/520,850, filed Sep. 13, 2006, which is a divisional of U.S. patent application Ser. No. 10/217,939, filed Aug. 12, 2002, now U.S. Pat. No. 7,135,618, which claims benefit to U.S. Provisional Application No. 60/311,414 filed Aug. 10, 2001. The entire contents of each of these applications are hereby incorporated by reference herein.

SUBMISSION OF SEQUENCE LISTING

The Sequence Listing associated with this application is filed in electronic format via EFS-Web and hereby incorporated by reference into the specification in its entirety. The name of the text file containing the Sequence Listing is Sequence_Listing_12810_00965_US. The size of the text file is 102 KB, and the text file was created on Dec. 4, 2009.

FIELD OF THE INVENTION

This invention relates generally to nucleic acid sequences encoding proteins that are related to the presence of seed storage compounds in plants. More specifically, the present invention relates to nucleic acid sequences encoding sugar and lipid metabolism regulator proteins and the use of these sequences in transgenic plants. The invention further relates to methods of applying these novel plant polypeptides to the identification and stimulation of plant growth and/or to the increase of yield of seed storage compounds.

BACKGROUND

The study and genetic manipulation of plants has a long history that began even before the famed studies of Gregor Mendel. In perfecting this science, scientists have accomplished modification of particular traits in plants ranging from potato tubers having increased starch content to oilseed plants such as canola and sunflower having increased or altered fatty acid content. With the increased consumption and use of plant oils, the modification of seed oil content and seed oil levels has become increasingly widespread (e.g. Töpfer et al. 1995, Science 268:681-686). Manipulation of biosynthetic pathways in transgenic plants provides a number of opportunities for molecular biologists and plant biochemists to affect plant metabolism giving rise to the production of specific higher-value products. The seed oil production or composition has been altered in numerous traditional oilseed plants such as soybean (U.S. Pat. No. 5,955,650), canola (U.S. Pat. No. 5,955,650), sunflower (U.S. Pat. No. 6,084,164) and rapeseed (Töpfer et al. 1995, Science 268:681-686), and non-traditional oil seed plants such as tobacco (Cahoon et al. 1992, Proc. Natl. Acad. Sci. USA 89:11184-11188).

Plant seed oils comprise both neutral and polar lipids (see Table 1). The neutral lipids contain primarily triacylglycerol, which is the main storage lipid that accumulates in oil bodies in seeds. The polar lipids are mainly found in the various membranes of the seed cells, e.g. the endoplasmic reticulum, microsomal membranes and the cell membrane. The neutral and polar lipids contain several common fatty acids (see Table 2) and a range of less common fatty acids. The fatty acid composition of membrane lipids is highly regulated and only a select number of fatty acids are found in membrane lipids. On the other hand, a large number of unusual fatty acids can be incorporated into the neutral storage lipids in seeds of many plant species (Van de Loo F. J. et al. 1993, Unusual Fatty Acids in Lipid Metabolism in Plants pp. 91-126, editor TS Moore Jr. CRC Press; Millar et al. 2000, Trends Plant Sci. 5:95-101).

TABLE 1
Plant Lipid Classes

Neutral Lipids
    Triacylglycerol (TAG)
    Diacylglycerol (DAG)
    Monoacylglycerol (MAG)

Polar Lipids
    Monogalactosyldiacylglycerol (MGDG)
    Digalactosyldiacylglycerol (DGDG)
    Phosphatidylglycerol (PG)
    Phosphatidylcholine (PC)
    Phosphatidylethanolamine (PE)
    Phosphatidylinositol (PI)
    Phosphatidylserine (PS)
    Sulfoquinovosyldiacylglycerol

TABLE 2
Common Plant Fatty Acids

16:0      Palmitic acid
16:1      Palmitoleic acid
16:3      Palmitolenic acid
18:0      Stearic acid
18:1      Oleic acid
18:2      Linoleic acid
18:3      Linolenic acid
γ-18:3    Gamma-linolenic acid*
20:0      Arachidic acid
22:6      Docosahexaenoic acid (DHA)*
20:2      Eicosadienoic acid
20:4      Arachidonic acid (AA)*
20:5      Eicosapentaenoic acid (EPA)*
22:1      Erucic acid

*These fatty acids do not normally occur in plant seed oils, but their production in transgenic plant seed oil is of importance in plant biotechnology.
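The shorthand used in Table 2 gives the number of carbon atoms followed by the number of double bonds in the acyl chain. By way of illustration only, the following Python sketch reads a few of the Table 2 entries in that way. The function names and the choice of entries are assumptions made for this example and form no part of the disclosed methods.

```python
# A few entries copied from Table 2; the parsing helpers are illustrative only.
TABLE_2_EXCERPT = {
    "16:0": "Palmitic acid",
    "18:1": "Oleic acid",
    "18:3": "Linolenic acid",
    "γ-18:3": "Gamma-linolenic acid",
    "22:1": "Erucic acid",
}


def parse_shorthand(code):
    """Split 'C:D' into (carbon atoms, double bonds); a prefix such as 'γ-' is ignored."""
    carbons, double_bonds = code.split("-")[-1].split(":")
    return int(carbons), int(double_bonds)


def is_saturated(code):
    """A fatty acid is saturated when its shorthand lists zero double bonds."""
    return parse_shorthand(code)[1] == 0


if __name__ == "__main__":
    for code, name in TABLE_2_EXCERPT.items():
        carbons, double_bonds = parse_shorthand(code)
        print(f"{name}: {carbons} carbons, {double_bonds} double bond(s)")
```

The positional prefix (for example the gamma in γ-18:3) distinguishes isomers having the same chain length and degree of unsaturation, which is why the sketch drops it before counting.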

Lipids are synthesized from fatty acids and their synthesis may be divided into two parts: the prokaryotic pathway and the eukaryotic pathway (Browse et al. 1986, Biochemical J. 235:25-31; Ohlrogge & Browse 1995, Plant Cell 7:957-970). The prokaryotic pathway is located in plastids that are the primary site of fatty acid biosynthesis. Fatty acid synthesis begins with the conversion of acetyl-CoA to malonyl-CoA by acetyl-CoA carboxylase (ACCase). Malonyl-CoA is converted to malonyl-acyl carrier protein (ACP) by the malonyl-CoA:ACP transacylase. The enzyme beta-keto-acyl-ACP-synthase III (KAS III) catalyzes a condensation reaction in which the acyl group from acetyl-CoA is transferred to malonyl-ACP to form 3-ketobutyryl-ACP. In a subsequent series of condensation, reduction and dehydration reactions the nascent fatty acid chain on the ACP cofactor is elongated by the step-by-step addition (condensation) of two carbon atoms donated by malonyl-ACP until a 16- or 18-carbon saturated fatty acid chain is formed. The plastidial delta-9 acyl-ACP desaturase introduces the first unsaturated double bond into the fatty acid. Thioesterases cleave the fatty acids from the ACP cofactor and free fatty acids are exported to the cytoplasm where they participate as fatty acyl-CoA esters in the eukaryotic pathway. In this pathway the fatty acids are esterified by glycerol-3-phosphate acyltransferase and lysophosphatidic acid acyltransferase to the sn-1 and sn-2 positions of glycerol-3-phosphate, respectively, to yield phosphatidic acid (PA). The PA is the precursor for other polar and neutral lipids, the latter being formed in the Kennedy pathway (Voelker 1996, Genetic Engineering ed.: Setlow 18:111-113; Shanklin & Cahoon 1998, Annu. Rev. Plant Physiol. Plant Mol. Biol. 49:611-641; Frentzen 1998, Lipids 100:161-166; Millar et al. 2000, Trends Plant Sci. 5:95-101).
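The chain-length arithmetic of the elongation steps described above may be sketched as follows. The sketch assumes only what the preceding paragraph states, namely that the first condensation (KAS III) yields a 4-carbon acyl-ACP and that each later condensation adds two carbon atoms donated by malonyl-ACP. The function name is hypothetical and the code models bookkeeping only, not enzyme kinetics.

```python
def condensation_steps(chain_length):
    """Count the condensation reactions needed to reach a saturated chain.

    Assumes the first condensation gives a 4-carbon chain (acetyl + malonyl
    donor) and every further condensation adds 2 carbons from malonyl-ACP.
    """
    if chain_length < 4 or chain_length % 2 != 0:
        raise ValueError("chain length must be an even number of at least 4")
    return 1 + (chain_length - 4) // 2


if __name__ == "__main__":
    for carbons in (16, 18):
        steps = condensation_steps(carbons)
        # Each condensation consumes one malonyl-ACP, so the counts coincide.
        print(f"C{carbons}: {steps} condensations, {steps} malonyl-ACP units")
```

Under these assumptions a 16-carbon chain corresponds to seven condensations and an 18-carbon chain to eight, with one acetyl-CoA serving as the primer in each case.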

Storage lipids in seeds are synthesized from carbohydrate-derived precursors. Plants have a complete glycolytic pathway in the cytosol (Plaxton 1996, Annu. Rev. Plant Physiol. Plant Mol. Biol. 47:185-214) and it has been shown that a complete pathway also exists in the plastids of rapeseeds (Kang & Rawsthorne 1994, Plant J. 6:795-805). Sucrose is the primary source of carbon and energy, transported from the leaves into the developing seeds. During the storage phase of seeds, sucrose is converted in the cytosol to provide the metabolic precursors glucose-6-phosphate and pyruvate. These are transported into the plastids and converted into acetyl-CoA that serves as the primary precursor for the synthesis of fatty acids. Acetyl-CoA in the plastids is the central precursor for lipid biosynthesis. Acetyl-CoA can be formed in the plastids by different reactions and the exact contribution of each reaction is still being debated (Ohlrogge & Browse 1995, Plant Cell 7:957-970). It is however accepted that a large part of the acetyl-CoA is derived from glucose-6-phosphate and pyruvate that are imported from the cytoplasm into the plastids. Sucrose is produced in the source organs (leaves, or anywhere that photosynthesis occurs) and is transported to the developing seeds that are also termed sink organs. In the developing seeds, the sucrose is the precursor for all the storage compounds, i.e. starch, lipids and partly the seed storage proteins. Therefore, it is clear that carbohydrate metabolism in which sucrose plays a central role is very important to the accumulation of seed storage compounds.

Although lipid and fatty acid content of seed oil can be modified by the traditional methods of plant breeding, the advent of recombinant DNA technology has allowed for easier manipulation of the seed oil content of a plant, and in some cases, has allowed for the alteration of seed oils in ways that could not be accomplished by breeding alone (see, e.g., Töpfer et al. 1995, Science 268:681-686). For example, introduction of a Δ¹²-hydroxylase nucleic acid sequence into transgenic tobacco resulted in the introduction of a novel fatty acid, ricinoleic acid, into the tobacco seed oil (Van de Loo et al. 1995, Proc. Natl. Acad. Sci USA 92:6743-6747). Tobacco plants have also been engineered to produce low levels of petroselinic acid by the introduction and expression of an acyl-ACP desaturase from coriander (Cahoon et al. 1992, Proc. Natl. Acad. Sci USA 89:11184-11188).

The modification of seed oil content in plants has significant medical, nutritional and economic ramifications. With regard to the medical ramifications, the long chain fatty acids (C18 and longer) found in many seed oils have been linked to reductions in hypercholesterolemia and other clinical disorders related to coronary heart disease (Brenner 1976, Adv. Exp. Med. Biol. 83:85-101). Therefore, consumption of a plant having increased levels of these types of fatty acids may reduce the risk of heart disease. Enhanced levels of seed oil content also increase large-scale production of seed oils and thereby reduce the cost of these oils.

In order to increase or alter the levels of compounds such as seed oils in plants, nucleic acid sequences and proteins regulating lipid and fatty acid metabolism must be identified. As mentioned earlier, several desaturase nucleic acids such as the Δ⁶-desaturase nucleic acid, Δ¹²-desaturase nucleic acid and acyl-ACP desaturase nucleic acid have been cloned and demonstrated to encode enzymes required for fatty acid synthesis in various plant species. Oleosin nucleic acid sequences from such different species as Brassica, soybean, carrot, pine and Arabidopsis thaliana have also been cloned and determined to encode proteins associated with the phospholipid monolayer membrane of oil bodies in those plants.

It has also been determined that two phytohormones, gibberellic acid (GA) and abscisic acid (ABA), are involved in overall regulatory processes in seed development (e.g. Ritchie & Gilroy 1998, Plant Physiol. 116:765-776; Arenas-Huertero et al. 2000, Genes Dev. 14:2085-2096). Both the GA and ABA pathways are affected by okadaic acid, a protein phosphatase inhibitor (Kuo et al. 1996, Plant Cell 8:259-269). The regulation of protein phosphorylation by kinases and phosphatases is accepted as a universal mechanism of cellular control (Cohen 1992, Trends Biochem. Sci. 17:408-413). Likewise, the plant hormones ethylene (e.g. Zhou et al. 1998, Proc. Natl. Acad. Sci. USA 95:10294-10299; Beaudoin et al. 2000, Plant Cell 12:1103-1115) and auxin (e.g. Colon-Carmona et al. 2000, Plant Physiol. 124:1728-1738) are involved in controlling plant development as well.

Although several compounds are known that generally affect plant and seed development, there is a clear need to specifically identify factors that are more specific for the developmental regulation of storage compound accumulation and to identify genes which have the capacity to confer altered or increased oil production to their host plant and to other plant species. This invention discloses a large number of nucleic acid sequences from Arabidopsis thaliana. These nucleic acid sequences can be used to alter or increase the levels of seed storage compounds such as proteins, sugars and oils, in plants, including transgenic plants, such as rapeseed, canola, linseed, soybean, sunflower, maize, oat, rye, barley, wheat, pepper, tagetes, cotton, oil palm, coconut palm, flax, castor and peanut, which are oilseed plants containing high amounts of lipid compounds.

SUMMARY OF THE INVENTION

The present invention provides novel isolated nucleic acid and amino acid sequences associated with the metabolism of seed storage compounds in plants.

The present invention also provides an isolated nucleic acid from Arabidopsis encoding a Lipid Metabolism Protein (LMP), or a portion thereof. These sequences may be used to modify or increase lipids and fatty acids, cofactors and enzymes in microorganisms and plants.

Arabidopsis plants are known to produce considerable amounts of fatty acids like linoleic and linolenic acid (see, e.g., Table 2) and to be closely similar in many aspects (gene homology, etc.) to the oil crop plant Brassica. Therefore nucleic acid molecules originating from a plant like Arabidopsis thaliana are especially suited to modify the lipid and fatty acid metabolism in a host, especially in microorganisms and plants. Furthermore, nucleic acids from the plant Arabidopsis thaliana can be used to identify those DNA sequences and enzymes in other species which are useful to modify the biosynthesis of precursor molecules of fatty acids in the respective organisms.

The present invention further provides an isolated nucleic acid comprising a fragment of at least 15 nucleotides of a nucleic acid from a plant (Arabidopsis thaliana) encoding a Lipid Metabolism Protein (LMP), or a portion thereof.

Also provided by the present invention are polypeptides encoded by the nucleic acids, and heterologous polypeptides comprising polypeptides encoded by the nucleic acids, and antibodies to those polypeptides.

Additionally, the present invention relates to and provides the use of LMP nucleic acids in the production of transgenic plants having a modified level of a seed storage compound. A method of producing a transgenic plant with a modified level of a seed storage compound includes the steps of transforming a plant cell with an expression vector comprising a LMP nucleic acid, and generating a plant with a modified level of the seed storage compound from the plant cell. In a preferred embodiment, the plant is an oil producing species selected from the group consisting of rapeseed, canola, linseed, soybean, sunflower, maize, oat, rye, barley, wheat, pepper, tagetes, cotton, oil palm, coconut palm, flax, castor and peanut, for example.

According to the present invention, the compositions and methods described herein can be used to increase or decrease the level of a LMP in a transgenic plant comprising increasing or decreasing the expression of a LMP nucleic acid in the plant. Increased or decreased expression of the LMP nucleic acid can be achieved through in vivo mutagenesis of the LMP nucleic acid. The present invention can also be used to increase or decrease the level of a lipid in a seed oil, to increase or decrease the level of a fatty acid in a seed oil, or to increase or decrease the level of a starch in a seed or plant.

Also included herein is a seed produced by a transgenic plant transformed by a LMP DNA sequence, wherein the seed contains the LMP DNA sequence and wherein the plant is true breeding for a modified level of a seed storage compound. The present invention additionally includes a seed oil produced by the aforementioned seed.

Further provided by the present invention are vectors comprising the nucleic acids, host cells containing the vectors, and descendent plant materials produced by transforming a plant cell with the nucleic acids and/or vectors.

According to the present invention, the compounds, compositions, and methods described herein can be used to increase or decrease the level of a lipid in a seed oil, or to increase or decrease the level of a fatty acid in a seed oil, or to increase or decrease the level of a starch or other carbohydrate in a seed or plant. A method of producing a higher or lower than normal or typical level of storage compound in a transgenic plant comprises expressing a LMP nucleic acid from Arabidopsis thaliana in the transgenic plant, wherein the transgenic plant is Arabidopsis thaliana or a species different from Arabidopsis thaliana. Also included herein are compositions and methods of the modification of the efficiency of production of a seed storage compound.

Accordingly, it is an object of the present invention to provide novel isolated LMP nucleic acids and isolated LMP amino acid sequences from Arabidopsis thaliana, as well as active fragments, analogs and orthologs thereof.

It is another object of the present invention to provide transgenic plants having modified levels of seed storage compounds, and in particular, modified levels of a lipid, a fatty acid or a sugar.

It is a further object of the present invention to provide methods for producing such aforementioned transgenic plants.

It is another object of the present invention to provide seeds and seed oils from such aforementioned transgenic plants.

These and other objects, features and advantages of the present invention will become apparent after a review of the following detailed description of the disclosed embodiments and the appended claims.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A-B: FIG. 1A shows the polynucleotide sequences of the open reading frame of Clone ID NO: AT004002024 from Arabidopsis thaliana (SEQ ID NO:1) of the present invention. The polynucleotide sequence contains 648 nucleotides. FIG. 1B shows the deduced amino acid sequence of SEQ ID NO:1 (SEQ ID NO:2) (Clone ID NO: AT004002024) of the present invention. The polypeptide sequence contains 216 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 2A-B: FIG. 2A shows the polynucleotide sequences of the open reading frame of Clone ID NO: AT004004054 from Arabidopsis thaliana (SEQ ID NO:3) of the present invention. The polynucleotide sequence contains 720 nucleotides. FIG. 2B shows the deduced amino acid sequence of SEQ ID NO:3 (SEQ ID NO:4) (Clone ID NO: AT004004054) of the present invention. The polypeptide sequence contains 240 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 3A-B: FIG. 3A shows the polynucleotide sequences of the open reading frame of Clone ID NO: AT004005069 from Arabidopsis thaliana (SEQ ID NO:5) of the present invention. The polynucleotide sequence contains 1995 nucleotides. FIG. 3B shows the deduced amino acid sequence of SEQ ID NO:5 (SEQ ID NO:6) (Clone ID NO: AT004005069) of the present invention. The polypeptide sequence contains 665 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 4A-B: FIG. 4A shows the polynucleotide sequences of the open reading frame of Clone ID NO: AT004009021 from Arabidopsis thaliana (SEQ ID NO:7) of the present invention. The polynucleotide sequence contains 1200 nucleotides. FIG. 4B shows the deduced amino acid sequence of SEQ ID NO:7 (SEQ ID NO:8) (Clone ID NO: AT004009021) of the present invention. The polypeptide sequence contains 400 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 5A-D: FIG. 5A shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk109 from Arabidopsis thaliana (SEQ ID NO:9) of the present invention. The polynucleotide sequence contains 1173 nucleotides. FIG. 5B shows the deduced amino acid sequence of SEQ ID NO:9 (SEQ ID NO:10) (Clone ID NO: pk109) of the present invention. The polypeptide sequence contains 391 amino acids. FIG. 5C shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk109-1 from Arabidopsis thaliana (SEQ ID NO:11) of the present invention. The polynucleotide sequence contains 843 nucleotides. FIG. 5D shows the deduced amino acid sequence of SEQ ID NO:11 (SEQ ID NO:12) (Clone ID NO: pk109-1) of the present invention. The polypeptide sequence contains 281 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 6A-B: FIG. 6A shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk110 from Arabidopsis thaliana (SEQ ID NO:13) of the present invention. The polynucleotide sequence contains 2013 nucleotides. FIG. 6B shows the deduced amino acid sequence of SEQ ID NO:13 (SEQ ID NO:14) (Clone ID NO: pk110) of the present invention. The polypeptide sequence contains 671 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 7A-D: FIG. 7A shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk111 from Arabidopsis thaliana (SEQ ID NO:15) of the present invention. The polynucleotide sequence contains 2337 nucleotides. FIG. 7B shows the deduced amino acid sequence of SEQ ID NO:15 (SEQ ID NO:16) (Clone ID NO: pk111) of the present invention. The polypeptide sequence contains 779 amino acids. FIG. 7C shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk111-1 from Arabidopsis thaliana (SEQ ID NO:17) of the present invention. The polynucleotide sequence contains 1667 nucleotides. FIG. 7D shows the deduced amino acid sequence of SEQ ID NO:17 (SEQ ID NO:18) (Clone ID NO: pk111-1) of the present invention. The polypeptide sequence contains 557 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 8A-B: FIG. 8A shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk113 from Arabidopsis thaliana (SEQ ID NO:19) of the present invention. The polynucleotide sequence contains 1719 nucleotides. FIG. 8B shows the deduced amino acid sequence of SEQ ID NO:19 (SEQ ID NO:20) (Clone ID NO: pk113) of the present invention. The polypeptide sequence contains 573 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 9A-B: FIG. 9A shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk114 from Arabidopsis thaliana (SEQ ID NO:21) of the present invention. The polynucleotide sequence contains 894 nucleotides. FIG. 9B shows the deduced amino acid sequence of SEQ ID NO:21 (SEQ ID NO:22) (Clone ID NO: pk114) of the present invention. The polypeptide sequence contains 298 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 10A-B: FIG. 10A shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk116 from Arabidopsis thaliana (SEQ ID NO:23) of the present invention. The polynucleotide sequence contains 411 nucleotides. FIG. 10B shows the deduced amino acid sequence of SEQ ID NO:23 (SEQ ID NO:24) (Clone ID NO: pk116) of the present invention. The polypeptide sequence contains 137 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 11A-B: FIG. 11A shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk117 from Arabidopsis thaliana (SEQ ID NO:25) of the present invention. The polynucleotide sequence contains 900 nucleotides. FIG. 11B shows the deduced amino acid sequence of SEQ ID NO:25 (SEQ ID NO:26) (Clone ID NO: pk117) of the present invention. The polypeptide sequence contains 300 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 12A-D: FIG. 12A shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk118 from Arabidopsis thaliana (SEQ ID NO:27) of the present invention. The polynucleotide sequence contains 2415 nucleotides. FIG. 12B shows the deduced amino acid sequence of SEQ ID NO:27 (SEQ ID NO:28) (Clone ID NO: pk118) of the present invention. The polypeptide sequence contains 805 amino acids. FIG. 12C shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk118-1 from Arabidopsis thaliana (SEQ ID NO:29) of the present invention. The polynucleotide sequence contains 2391 nucleotides. FIG. 12D shows the deduced amino acid sequence of SEQ ID NO:29 (SEQ ID NO:30) (Clone ID NO: pk118-1) of the present invention. The polypeptide sequence contains 797 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.

FIGS. 13A-B: FIG. 13A shows the polynucleotide sequences of the open reading frame of Clone ID NO: pk120 from Arabidopsis thaliana (SEQ ID NO:31) of the present invention. The polynucleotide sequence contains 1530 nucleotides. FIG. 13B shows the deduced amino acid sequence of SEQ ID NO:31 (SEQ ID NO:32) (Clone ID NO: pk120) of the present invention. The polypeptide sequence contains 510 amino acids. The standard one-letter abbreviation for amino acids is used to illustrate the deduced amino acid sequence.
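Because each codon of an open reading frame specifies one amino acid, the nucleotide and amino acid counts stated in the figure descriptions can be cross-checked by simple arithmetic. The following Python sketch tabulates the counts exactly as stated above and flags any entry in which the nucleotide count is not three times the amino acid count. It assumes that the stated lengths exclude a stop codon, and the variable and function names are chosen for this example only.

```python
# (nucleotide SEQ ID NO, clone ID, nucleotides, amino acids), as stated in
# the figure descriptions; nothing here is read from the Sequence Listing.
FIGURE_LENGTHS = [
    (1, "AT004002024", 648, 216),
    (3, "AT004004054", 720, 240),
    (5, "AT004005069", 1995, 665),
    (7, "AT004009021", 1200, 400),
    (9, "pk109", 1173, 391),
    (11, "pk109-1", 843, 281),
    (13, "pk110", 2013, 671),
    (15, "pk111", 2337, 779),
    (17, "pk111-1", 1667, 557),
    (19, "pk113", 1719, 573),
    (21, "pk114", 894, 298),
    (23, "pk116", 411, 137),
    (25, "pk117", 900, 300),
    (27, "pk118", 2415, 805),
    (29, "pk118-1", 2391, 797),
    (31, "pk120", 1530, 510),
]


def inconsistent_entries(rows):
    """Return rows whose nucleotide count differs from 3 x the amino acid count."""
    return [row for row in rows if row[2] != 3 * row[3]]


if __name__ == "__main__":
    for seq_id, clone, nucleotides, amino_acids in inconsistent_entries(FIGURE_LENGTHS):
        print(f"SEQ ID NO:{seq_id} ({clone}): {nucleotides} nt vs {amino_acids} aa")
```

With the counts as stated, only the entry for SEQ ID NO:17 (Clone ID NO: pk111-1) is flagged, since 557 amino acids would correspond to 1671 nucleotides. The check cannot indicate which of the two stated numbers is at variance with the Sequence Listing, which remains the controlling source.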

DETAILED DESCRIPTION OF THE INVENTION

The present invention may be understood more readily by reference to the following detailed description of the preferred embodiments of the invention and the Examples included therein.

Before the present compounds, compositions, and methods are disclosed and described, it is to be understood that this invention is not limited to specific nucleic acids, specific polypeptides, specific cell types, specific host cells, specific conditions, or specific methods, etc., as such may, of course, vary, and the numerous modifications and variations therein will be apparent to those skilled in the art. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting. As used in the specification and in the claims, “a” or “an” can mean one or more, depending upon the context in which it is used. Thus, for example, reference to “a cell” can mean that at least one cell can be utilized.

In accordance with the purpose(s) of this invention, as embodied and broadly described herein, this invention, in one aspect, provides an isolated nucleic acid from a plant (Arabidopsis thaliana) encoding a Lipid Metabolism Protein (LMP), or a portion thereof.

One aspect of the invention pertains to isolated nucleic acid molecules that encode LMP polypeptides or biologically active portions thereof, as well as nucleic acid fragments sufficient for use as hybridization probes or primers for the identification or amplification of an LMP-encoding nucleic acid (e.g., LMP DNA). As used herein, the term “nucleic acid molecule” is intended to include DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. This term also encompasses untranslated sequence located at both the 3′ and 5′ ends of the coding region of a gene: at least about 1000 nucleotides of sequence upstream from the 5′ end of the coding region and at least about 200 nucleotides of sequence downstream from the 3′ end of the coding region of the gene. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA. An “isolated” nucleic acid molecule is one which is substantially separated from other nucleic acid molecules which are present in the natural source of the nucleic acid. Preferably, an “isolated” nucleic acid is substantially free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For example, in various embodiments, the isolated LMP nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb or 0.1 kb of nucleotide sequences which naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived (e.g., an Arabidopsis thaliana cell). Moreover, an “isolated” nucleic acid molecule, such as a cDNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized.

A nucleic acid molecule of the present invention, e.g., a nucleic acid molecule having a nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31, or a portion thereof, can be isolated using standard molecular biology techniques and the sequence information provided herein. For example, an Arabidopsis thaliana LMP cDNA can be isolated from an Arabidopsis thaliana library using all or a portion of one of the sequences shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31 as a hybridization probe and standard hybridization techniques (e.g., as described in Sambrook et al. 1989, Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). Moreover, a nucleic acid molecule encompassing all or a portion of one of the sequences shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31 can be isolated by the polymerase chain reaction using oligonucleotide primers designed based upon that same sequence. For example, mRNA can be isolated from plant cells (e.g., by the guanidinium-thiocyanate extraction procedure of Chirgwin et al. 1979, Biochemistry 18:5294-5299) and cDNA can be prepared using reverse transcriptase (e.g., Moloney MLV reverse transcriptase, available from Gibco/BRL, Bethesda, Md.; or AMV reverse transcriptase, available from Seikagaku America, Inc., St. Petersburg, Fla.). Synthetic oligonucleotide primers for polymerase chain reaction amplification can be designed based upon one of the nucleotide sequences shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31. A nucleic acid of the invention can be amplified using cDNA or, alternatively, genomic DNA, as a template and appropriate oligonucleotide primers according to standard PCR amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector and characterized by DNA sequence analysis. Furthermore, oligonucleotides corresponding to a LMP nucleotide sequence can be prepared by standard synthetic techniques, e.g., using an automated DNA synthesizer.
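By way of illustration only, the design of a primer pair from a known sequence may be sketched in Python as follows. The sketch takes the forward primer from the 5′ end of the template and the reverse primer as the reverse complement of its 3′ end. The default primer length of 20 nucleotides, the function names, and the short template used in the usage note are assumptions made for this example; the template is hypothetical and is not taken from the Sequence Listing. Melting temperature, GC content and secondary structure, which a practical design would weigh, are not considered.

```python
# Watson-Crick complements for DNA, upper and lower case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def design_primers(template, length=20):
    """Return (forward, reverse) primers spanning the whole template.

    The forward primer copies the first `length` bases; the reverse primer
    is the reverse complement of the last `length` bases, so that both
    primers read 5' to 3' toward each other.
    """
    if len(template) < 2 * length:
        raise ValueError("template is too short for two non-overlapping primers")
    return template[:length], reverse_complement(template[-length:])
```

For a hypothetical template such as "ATGGCT" followed by thirty "A" residues and "GGCTAA", calling design_primers with length=6 gives the forward primer "ATGGCT" and the reverse primer "TTAGCC".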

In a preferred embodiment, an isolated nucleic acid of the invention comprises one of the nucleotide sequences shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31. These polynucleotides correspond to the Arabidopsis thaliana LMP cDNAs of the invention. These cDNAs comprise sequences encoding LMPs (i.e., the “coding region”), as well as 5′ untranslated sequences and 3′ untranslated sequences. Alternatively, the nucleic acid molecules can comprise only the coding region of any of the polynucleotide sequences described herein. Examples of polynucleotides comprising only the coding region or open reading frame (ORF) are shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31.

For the purposes of this application, it will be understood that each of the sequences set forth in the Figures has an identifying entry number (e.g., pk109). Each of these sequences may generally comprise three parts: a 5′ upstream region, a coding region, and a downstream region. The particular sequences shown in the Figures represent the open reading frames. The putative functions of these proteins are indicated in Table 3. In another preferred embodiment, an isolated nucleic acid molecule of the present invention encodes a polypeptide that is able to participate in the metabolism of seed storage compounds such as lipids, starch and seed storage proteins and that contains an antioxidant domain, a beta-oxidation domain, an acyltransferase domain, a dehydrogenase domain, an ATP synthase domain, a kinase domain, an isocitrate lyase domain, a sucrose synthase domain, or a membrane-associated domain. Examples of isolated LMPs that contain such domains can be found in Table 4. LMPs containing an antioxidant domain include that shown in SEQ ID NO:1. LMPs containing a beta-oxidation domain include that shown in SEQ ID NO:3. LMPs containing an acyltransferase domain include that shown in SEQ ID NO:5. LMPs containing a dehydrogenase domain include those shown in SEQ ID NO:7 and SEQ ID NO:25. LMPs containing an ATP synthase domain include that shown in SEQ ID NO:21. LMPs containing a kinase domain include those shown in SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:23, and SEQ ID NO:31. LMPs containing an isocitrate lyase domain include that shown in SEQ ID NO:19. LMPs containing a membrane-associated domain include those shown in SEQ ID NO:15 and SEQ ID NO:17.

In another preferred embodiment, an isolated nucleic acid molecule ofthe invention comprises a nucleic acid molecule which is a complement ofany of the nucleic acid sequences disclosed herein, including one of thenucleotide sequences shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ IDNO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ IDNO:27, SEQ ID NO:29, SEQ ID NO:31, or a portion thereof. As used herein,the term “complementary” refers to a nucleotide sequence that canhybridize to one of the nucleotide sequences shown in SEQ ID NO:1, SEQID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ IDNO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ IDNO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31.

In still another preferred embodiment, an isolated nucleic acid moleculeof the invention comprises a nucleotide sequence which is at least about50-60%, preferably at least about 60-70%, more preferably at least about70-80%, 80-90%, or 90-95%, and even more preferably at least about 95%,96%, 97%, 98%, 99% or more homologous to a nucleotide sequence shown inSEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ IDNO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ IDNO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ IDNO:31, or a portion thereof. In an additional preferred embodiment, anisolated nucleic acid molecule of the invention comprises a nucleotidesequence which hybridizes, e.g., hybridizes under stringent conditions,to one of the nucleotide sequences shown in SEQ ID NO:1, SEQ ID NO:3,SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ IDNO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31, or a portionthereof. These hybridization conditions include washing with a solutionhaving a salt concentration of about 0.02 molar at pH 7 at about 60° C.

Moreover, the nucleic acid molecule of the invention can comprise only a portion of the coding region of one of the sequences in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31, for example a fragment which can be used as a probe or primer or a fragment encoding a biologically active portion of a LMP. The nucleotide sequences determined from the cloning of the LMP genes from Arabidopsis thaliana allow for the generation of probes and primers designed for use in identifying and/or cloning LMP homologues in other cell types and organisms, as well as LMP homologues from other plants or related species. Therefore this invention also provides compounds comprising the nucleic acids disclosed herein, or fragments thereof. These compounds include the nucleic acids attached to a moiety. These moieties include, but are not limited to, detection moieties, hybridization moieties, purification moieties, delivery moieties, reaction moieties, binding moieties, and the like. The probe/primer typically comprises a substantially purified oligonucleotide.
The oligonucleotide typicallycomprises a region of nucleotide sequence that hybridizes understringent conditions to at least about 12, preferably about 25, morepreferably about 40, 50 or 75 consecutive nucleotides of a sense strandof one of the sequences set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ IDNO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ IDNO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ IDNO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31, an anti-sensesequence of one of the sequences set forth in SEQ ID NO:1, SEQ ID NO:3,SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ IDNO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31, or naturallyoccurring mutants thereof. Primers based on a nucleotide sequence shownin SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ IDNO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ IDNO:31 can be used in PCR reactions to clone LMP homologues. Probes basedon the LMP nucleotide sequences can be used to detect transcripts orgenomic sequences encoding the same or homologous proteins. In preferredembodiments, the probe further comprises a label group attached thereto,e.g. the label group can be a radioisotope, a fluorescent compound, anenzyme, or an enzyme co-factor. Such probes can be used as a part of agenomic marker test kit for identifying cells which express a LMP, suchas by measuring a level of a LMP-encoding nucleic acid in a sample ofcells, e.g., detecting LMP mRNA levels or determining whether a genomicLMP gene has been mutated or deleted.

In one embodiment, the nucleic acid molecule of the invention encodes a protein or portion thereof which includes an amino acid sequence which is sufficiently homologous to an amino acid sequence encoded by a sequence shown in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:32 such that the protein or portion thereof maintains the same or a similar function as the wild-type protein. As used herein, the language “sufficiently homologous” refers to proteins or portions thereof which have amino acid sequences which include a minimum number of identical or equivalent (e.g., an amino acid residue which has a similar side chain as an amino acid residue in one of the ORFs of a sequence shown in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:32) amino acid residues to an amino acid sequence such that the protein or portion thereof is able to participate in the metabolism of compounds necessary for the production of seed storage compounds in plants, construction of cellular membranes in microorganisms or plants, or in the transport of molecules across these membranes. Regulatory proteins, such as DNA binding proteins, transcription factors, kinases, phosphatases, or protein members of metabolic pathways such as the lipid, starch and protein biosynthetic pathways, or membrane transport systems, may play a role in the biosynthesis of seed storage compounds. Examples of such activities are described herein (see putative annotations in Table 3).
Examples of LMP-encoding nucleic acid sequences are set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31.

Because altered or increased sugar and/or fatty acid production is a trait that is generally desirable to introduce into a wide variety of plants, such as maize, wheat, rye, oat, triticale, rice, barley, soybean, peanut, cotton, rapeseed, canola, manihot, pepper, sunflower and tagetes, solanaceous plants like potato, tobacco, eggplant, and tomato, Vicia species, pea, alfalfa, bushy plants (coffee, cacao, tea), Salix species, trees (oil palm, coconut) and perennial grasses and forage crops, these crop plants are also preferred target plants for genetic engineering as one further embodiment of the present invention.

Portions of proteins encoded by the LMP nucleic acid molecules of the invention are preferably biologically active portions of one of the LMPs. As used herein, the term “biologically active portion of a LMP” is intended to include a portion, e.g., a domain/motif, of a LMP that participates in the metabolism of compounds necessary for the biosynthesis of seed storage lipids, or the construction of cellular membranes in microorganisms or plants, or in the transport of molecules across these membranes, or has an activity as set forth in Table 3. To determine whether a LMP or a biologically active portion thereof can participate in the metabolism of compounds necessary for the production of seed storage compounds and cellular membranes, an assay of enzymatic activity may be performed. Such assay methods are well known to those skilled in the art and are described in Example 14 of the Exemplification.

Biologically active portions of a LMP include peptides comprising aminoacid sequences derived from the amino acid sequence of a LMP (e.g., anamino acid sequence encoded by a nucleic acid shown in SEQ ID NO:1, SEQID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ IDNO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ IDNO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31 or theamino acid sequence of a protein homologous to a LMP, which includefewer amino acids than a full length LMP or the full length proteinwhich is homologous to a LMP) and exhibit at least one activity of aLMP. Typically, biologically active portions (peptides, e.g., peptideswhich are, for example, 5, 10, 15, 20, 30, 35, 36, 37, 38, 39, 40, 50,100 or more amino acids in length) comprise a domain or motif with atleast one activity of a LMP. Moreover, other biologically activeportions, in which other regions of the protein are deleted, can beprepared by recombinant techniques and evaluated for one or more of theactivities described herein. Preferably, the biologically activeportions of a LMP include one or more selected domains/motifs orportions thereof having biological activity.

Additional nucleic acid fragments encoding biologically active portions of a LMP can be prepared by isolating a portion of one of the sequences, expressing the encoded portion of the LMP or peptide (e.g., by recombinant expression in vitro) and assessing the activity of the encoded portion of the LMP or peptide.

The invention further encompasses nucleic acid molecules that differfrom one of the nucleotide sequences shown in SEQ ID NO:1, SEQ ID NO:3,SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ IDNO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31 (and portionsthereof) due to degeneracy of the genetic code and thus encode the sameLMP as that encoded by the nucleotide sequences shown in the above SEQID Nos in the Figures. In a further embodiment, the nucleic acidmolecule of the invention encodes a full length protein which issubstantially homologous to an amino acid sequence of a polypeptideencoded by an open reading frame shown in SEQ ID NO:2, SEQ ID NO:4, SEQID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ IDNO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ IDNO:26, SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:32. In one embodiment,the full-length nucleic acid or protein or fragment of the nucleic acidor protein is from Arabidopsis thaliana.
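
By way of non-limiting illustration only, the effect of the degeneracy of the genetic code can be sketched in Python. The two short coding sequences below are hypothetical and are not taken from the Sequence Listing; the helper name `translate` and the use of the standard genetic code are assumptions of this sketch.

```python
# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO_ACIDS
    )
}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence (one-letter codes, '*' marks a stop)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Hypothetical sequences that differ at several codons yet encode the same peptide.
reference = "ATGGCTTCTAAATAA"
degenerate_variant = "ATGGCCAGCAAGTGA"

print(translate(reference))
print(translate(degenerate_variant))
```

Both hypothetical sequences translate to the same peptide, which is the sense in which nucleic acid molecules differing only by codon degeneracy encode the same LMP.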

In addition to the Arabidopsis thaliana LMP nucleotide sequences shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31, it will be appreciated by those skilled in the art that DNA sequence polymorphisms that lead to changes in the amino acid sequences of LMPs may exist within a population (e.g., the Arabidopsis thaliana population). Such genetic polymorphism in the LMP gene may exist among individuals within a population due to natural variation. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules comprising an open reading frame encoding a LMP, preferably an Arabidopsis thaliana LMP. Such natural variations can typically result in 1-40% variance in the nucleotide sequence of the LMP gene. Any and all such nucleotide variations and resulting amino acid polymorphisms in LMP that are the result of natural variation and that do not alter the functional activity of LMPs are intended to be within the scope of the invention.

Nucleic acid molecules corresponding to natural variants and non-Arabidopsis thaliana orthologs of the Arabidopsis thaliana LMP cDNA of the invention can be isolated based on their homology to Arabidopsis thaliana LMP nucleic acid disclosed herein using the Arabidopsis thaliana cDNA, or a portion thereof, as a hybridization probe according to standard hybridization techniques under stringent hybridization conditions. As used herein, the term “orthologs” refers to two nucleic acids from different species that have evolved from a common ancestral gene by speciation. Normally, orthologs encode proteins having the same or similar functions. Accordingly, in another embodiment, an isolated nucleic acid molecule of the invention is at least 15 nucleotides in length and hybridizes under stringent conditions to the nucleic acid molecule comprising a nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31. In other embodiments, the nucleic acid is at least 30, 50, 100, 250 or more nucleotides in length. As used herein, the term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences at least 60% homologous to each other typically remain hybridized to each other. Preferably, the conditions are such that sequences at least about 65%, more preferably at least about 70%, and even more preferably at least about 75% or more homologous to each other typically remain hybridized to each other. Such stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6.
A preferred, non-limiting example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 50-65° C. Preferably, an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to a sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31 corresponds to a naturally occurring nucleic acid molecule. As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein). In one embodiment, the nucleic acid encodes a natural Arabidopsis thaliana LMP.

In addition to naturally-occurring variants of the LMP sequence that may exist in the population, the skilled artisan will further appreciate that changes can be introduced by mutation into a nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31, thereby leading to changes in the amino acid sequence of the encoded LMP, without altering the functional ability of the LMP. For example, nucleotide substitutions leading to amino acid substitutions at “non-essential” amino acid residues can be made in a sequence shown in SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:32. A “non-essential” amino acid residue is a residue that can be altered from the wild-type sequence of one of the LMPs (SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, or SEQ ID NO:32) without altering the activity of said LMP, whereas an “essential” amino acid residue is required for LMP activity. Other amino acid residues, however, (e.g., those that are not conserved or only semi-conserved in the domain having LMP activity) may not be essential for activity and thus are likely to be amenable to alteration without altering LMP activity.

Accordingly, another aspect of the invention pertains to nucleic acidmolecules encoding LMPs that contain changes in amino acid residues thatare not essential for LMP activity. Such LMPs differ in amino acidsequence from a sequence yet retain at least one of the LMP activitiesdescribed herein. In one embodiment, the isolated nucleic acid moleculecomprises a nucleotide sequence encoding a protein, wherein the proteincomprises an amino acid sequence at least about 50% homologous to anamino acid sequence encoded by a nucleic acid of SEQ ID NO:1, SEQ IDNO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13,SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23,SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31, and iscapable of participation in the metabolism of compounds necessary forthe production of seed storage compounds in Arabidopsis thaliana, orcellular membranes, or has one or more activities set forth in Table 3.Preferably, the protein encoded by the nucleic acid molecule is at leastabout 50-60% homologous to one of the sequences encoded by a nucleicacid of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9,SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19,SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, orSEQ ID NO:31, more preferably at least about 60-70% homologous to one ofthe sequences encoded by a nucleic acid of SEQ ID NO:1, SEQ ID NO:3, SEQID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ IDNO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ IDNO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31, even more preferablyat least about 70-80%, 80-90%, 90-95% homologous to one of the sequencesencoded by a nucleic acid of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ IDNO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ IDNO:27, SEQ ID NO:29, or SEQ ID NO:31, and 
most preferably at least about96%, 97%, 98%, or 99% homologous to one of the sequences encoded by anucleic acid of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ IDNO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ IDNO:29, or SEQ ID NO:31.

To determine the percent homology of two amino acid sequences (e.g., oneof the sequences encoded by a nucleic acid of SEQ ID NO:1, SEQ ID NO:3,SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ IDNO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31 and a mutant formthereof) or of two nucleic acids, the sequences are aligned for optimalcomparison purposes (e.g., gaps can be introduced in the sequence of oneprotein or nucleic acid for optimal alignment with the other protein ornucleic acid). The amino acid residues or nucleotides at correspondingamino acid positions or nucleotide positions are then compared. When aposition in one sequence (e.g., one of the sequences encoded by anucleic acid of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ IDNO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ IDNO:29, or SEQ ID NO:31) is occupied by the same amino acid residue ornucleotide as the corresponding position in the other sequence (e.g., amutant form of the sequence selected from the polypeptide encoded by anucleic acid of SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ IDNO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ IDNO:29, or SEQ ID NO:31), then the molecules are homologous at thatposition (i.e., as used herein amino acid or nucleic acid “homology” isequivalent to amino acid or nucleic acid “identity”). The percenthomology between the two sequences is a function of the number ofidentical positions shared by the sequences (i.e., % homology=numbers ofidentical positions/total numbers of positions×100).
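
By way of non-limiting illustration only, the percent homology calculation described above can be sketched in Python for two sequences that have already been aligned. The function name `percent_identity`, the use of "-" as the gap character, and the choice to count every alignment column (including gap columns) as a position are assumptions of this sketch; the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent homology = identical positions / total positions x 100.

    Both inputs must already be aligned, i.e. padded with the gap
    character so that corresponding positions share the same index.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # A column counts as identical only if both residues match and neither is a gap.
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical aligned peptide fragments (not from the Sequence Listing).
print(percent_identity("MKTAYIAK", "MKTAHIAK"))  # 7 of 8 positions identical
print(percent_identity("MK-TA", "MKQTA"))        # gap column counts as a mismatch
```

Other conventions, for example excluding gap columns from the denominator, give different values for the same alignment, so the convention used should be stated alongside any reported percentage.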

An isolated nucleic acid molecule encoding a LMP homologous to a proteinsequence encoded by a nucleic acid shown in SEQ ID NO:1, SEQ ID NO:3,SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ IDNO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31 can be created byintroducing one or more nucleotide substitutions, additions or deletionsinto a nucleotide sequence SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ IDNO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ IDNO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ IDNO:27, SEQ ID NO:29, or SEQ ID NO:31 such that one or more amino acidsubstitutions, additions or deletions are introduced into the encodedprotein. Mutations can be introduced into one of the sequences SEQ IDNO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11,SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21,SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31by standard techniques, such as site-directed mutagenesis andPCR-mediated mutagenesis. Preferably, conservative amino acidsubstitutions are made at one or more predicted non-essential amino acidresidues. A “conservative amino acid substitution” is one in which theamino acid residue is replaced with an amino acid residue having asimilar side chain. Families of amino acid residues having similar sidechains have been defined in the art. These families include amino acidswith basic side chains (e.g., lysine, arginine, histidine), acidic sidechains (e.g., aspartic acid, glutamic acid), uncharged polar side chains(e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine,cysteine), nonpolar side chains (e.g., alanine, valine, leucine,isoleucine, proline, phenylalanine, methionine, tryptophan),beta-branched side chains (e.g., threonine, valine, isoleucine) andaromatic side chains (e.g., tyrosine, phenylalanine, tryptophan,histidine). 
Thus, a predicted non-essential amino acid residue in a LMPis preferably replaced with another amino acid residue from the sameside chain family. Alternatively, in another embodiment, mutations canbe introduced randomly along all or part of a LMP coding sequence, suchas by saturation mutagenesis, and the resultant mutants can be screenedfor a LMP activity described herein to identify mutants that retain LMPactivity. Following mutagenesis of one of the sequences SEQ ID NO:1, SEQID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ IDNO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ IDNO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31, theencoded protein can be expressed recombinantly and the activity of theprotein can be determined using, for example, assays described herein(see Examples 11-13 of the Exemplification).
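
By way of non-limiting illustration only, the side chain families listed above can be encoded as a lookup so that a proposed substitution can be checked for conservativeness. The family memberships are taken from the preceding paragraph, expressed in one-letter amino acid codes; the names `SIDE_CHAIN_FAMILIES`, `shared_families` and `is_conservative` are hypothetical.

```python
# Side chain families as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),               # lysine, arginine, histidine
    "acidic": set("DE"),               # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),       # threonine, valine, isoleucine
    "aromatic": set("YFWH"),           # tyrosine, phenylalanine, tryptophan, histidine
}


def shared_families(original: str, replacement: str) -> list[str]:
    """Return the names of all families that contain both residues."""
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if original in members and replacement in members
    )


def is_conservative(original: str, replacement: str) -> bool:
    """True if the residues differ but share at least one side chain family."""
    return original != replacement and bool(shared_families(original, replacement))


print(is_conservative("K", "R"))   # both basic
print(is_conservative("D", "K"))   # acidic versus basic
print(shared_families("T", "V"))   # share only the beta-branched family
```

Because some residues belong to more than one family (for example, histidine is listed as both basic and aromatic), the sketch treats a substitution as conservative if any one family contains both residues.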

LMPs are preferably produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding the protein is cloned into an expression vector (as described above), the expression vector is introduced into a host cell (as described herein) and the LMP is expressed in the host cell. The LMP can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques. As an alternative to recombinant expression, a LMP or peptide thereof can be synthesized chemically using standard peptide synthesis techniques. Moreover, native LMP can be isolated from cells, for example using an anti-LMP antibody, which can be produced by standard techniques utilizing a LMP or fragment thereof of this invention.

The invention also provides LMP chimeric or fusion proteins. As usedherein, a LMP “chimeric protein” or “fusion protein” comprises a LMPpolypeptide operatively linked to a non-LMP polypeptide. An “LMPpolypeptide” refers to a polypeptide having an amino acid sequencecorresponding to a LMP, whereas a “non-LMP polypeptide” refers to apolypeptide having an amino acid sequence corresponding to a proteinwhich is not substantially homologous to the LMP, e.g., a protein whichis different from the LMP and which is derived from the same or adifferent organism. Within the fusion protein, the term “operativelylinked” is intended to indicate that the LMP polypeptide and the non-LMPpolypeptide are fused to each other so that both sequences fulfill theproposed function attributed to the sequence used. The non-LMPpolypeptide can be fused to the N-terminus or C-terminus of the LMPpolypeptide. For example, in one embodiment, the fusion protein is aGST-LMP (glutathione S-transferase) fusion protein in which the LMPsequences are fused to the C-terminus of the GST sequences. Such fusionproteins can facilitate the purification of recombinant LMPs. In anotherembodiment, the fusion protein is a LMP containing a heterologous signalsequence at its N-terminus. In certain host cells (e.g., mammalian hostcells), expression and/or secretion of a LMP can be increased throughuse of a heterologous signal sequence.

Preferably, a LMP chimeric or fusion protein of the invention isproduced by standard recombinant DNA techniques. For example, DNAfragments coding for the different polypeptide sequences are ligatedtogether in-frame in accordance with conventional techniques, forexample by employing blunt-ended or stagger-ended termini for ligation,restriction enzyme digestion to provide for appropriate termini,filling-in of cohesive ends as appropriate, alkaline phosphatasetreatment to avoid undesirable joining, and enzymatic ligation. Inanother embodiment, the fusion gene can be synthesized by conventionaltechniques including automated DNA synthesizers. Alternatively, PCRamplification of gene fragments can be carried out using anchor primerswhich give rise to complementary overhangs between two consecutive genefragments which can subsequently be annealed and reamplified to generatea chimeric gene sequence (see, for example, Current Protocols inMolecular Biology, eds. Ausubel et al., John Wiley & Sons: 1992).Moreover, many expression vectors are commercially available thatalready encode a fusion moiety (e.g., a GST polypeptide). AnLMP-encoding nucleic acid can be cloned into such an expression vectorsuch that the fusion moiety is linked in-frame to the LMP.

In addition to the nucleic acid molecules encoding LMPs described above, another aspect of the invention pertains to isolated nucleic acid molecules which are antisense thereto. An “antisense” nucleic acid comprises a nucleotide sequence which is complementary to a “sense” nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule or complementary to an mRNA sequence. Accordingly, an antisense nucleic acid can hydrogen bond to a sense nucleic acid. The antisense nucleic acid can be complementary to an entire LMP coding strand, or to only a portion thereof. In one embodiment, an antisense nucleic acid molecule is antisense to a “coding region” of the coding strand of a nucleotide sequence encoding a LMP. The term “coding region” refers to the region of the nucleotide sequence comprising codons which are translated into amino acid residues (e.g., the entire coding region of pk109 comprises nucleotides 1 to 1173). In another embodiment, the antisense nucleic acid molecule is antisense to a “noncoding region” of the coding strand of a nucleotide sequence encoding LMP. The term “noncoding region” refers to 5′ and 3′ sequences which flank the coding region that are not translated into amino acids (i.e., also referred to as 5′ and 3′ untranslated regions).

Given the coding strand sequences encoding LMP disclosed herein (e.g., the sequences set forth in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31), antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick base pairing. The antisense nucleic acid molecule can be complementary to the entire coding region of LMP mRNA, but more preferably is an oligonucleotide which is antisense to only a portion of the coding or noncoding region of LMP mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of LMP mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length. An antisense or sense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used.
Examples of modifiednucleotides which can be used to generate the antisense nucleic acidinclude 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)uracil, 5-carboxymethylamino-methyl-2-thiouridine,5-carboxymethylaminomethyluracil, dihydro-uracil,beta-D-galactosylqueosine, inosine, N-6-isopentenyladenine,1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,2-methylguanine, 3-methylcytosine, 5-methyl-cytosine, N-6-adenine,7-methylguanine, 5-methylaminomethyluracil,5-methoxyamino-methyl-2-thiouracil, beta-D-mannosylqueosine,5′-methoxycarboxymethyl-uracil, 5-methoxyuracil,2-methylthio-N-6-isopentenyladenine, uracil-5-oxyacetic acid (v),wybutoxosine, pseudouracil, queosine, 2-thiocytosine,5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v),5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp1)w,and 2,6-diamino-purine. Alternatively, the antisense nucleic acid can beproduced biologically using an expression vector into which a nucleicacid has been subcloned in an antisense orientation (i.e., RNAtranscribed from the inserted nucleic acid will be of an antisenseorientation to a target nucleic acid of interest, described further inthe following subsection).
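
By way of non-limiting illustration only, the design of an antisense oligonucleotide by Watson and Crick base pairing can be sketched in Python as the reverse complement of a window of the coding strand surrounding the translation start site. The coding strand shown is hypothetical and is not taken from the Sequence Listing; the function names, the 5-nucleotide upstream offset and the 20-nucleotide length are assumptions of this sketch, and the oligonucleotide is written as unmodified DNA.

```python
# Watson-Crick complements for unmodified DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(coding_strand: str, start_codon_index: int,
                    upstream: int = 5, length: int = 20) -> str:
    """Antisense oligonucleotide covering the region around the start codon.

    The target window begins `upstream` nucleotides 5' of the start codon
    (or at position 0 if the sequence is shorter) and spans `length` nucleotides.
    """
    begin = max(0, start_codon_index - upstream)
    target = coding_strand[begin:begin + length]
    return reverse_complement(target)


# Hypothetical coding strand: 10 nt of 5' untranslated sequence, then ATG...
coding_strand = "GAGCTAAACAATGGCTTCTAAAGGAGAA"
start = coding_strand.find("ATG")
oligo = antisense_oligo(coding_strand, start)
print(oligo)
```

The resulting oligonucleotide contains the sequence CAT, the reverse complement of the ATG start codon, and taking its reverse complement again recovers the targeted window of the coding strand.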

In another variation of the antisense technology, a double-strand interfering RNA construct can be used to cause a down-regulation of the LMP mRNA level and LMP activity in transgenic plants. This requires transforming the plants with a chimeric construct containing a portion of the LMP sequence in the sense orientation fused to the antisense sequence of the same portion of the LMP sequence. A DNA linker region of variable length can be used to separate the sense and antisense fragments of LMP sequences in the construct (see, for example, Chuang & Meyerowitz 2000, Proc. Natl. Acad. Sci. USA 97:4985-4990).
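As a non-limiting sketch, the layout of such a chimeric construct (sense fragment, linker, antisense fragment) can be expressed in Python. The function names and the short sequences are hypothetical and serve only to show the arrangement of the three parts.

```python
# Illustrative sketch only: names and sequences are hypothetical.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def hairpin_insert(sense_fragment: str, linker: str) -> str:
    """Join a sense fragment, a linker, and the antisense copy of the fragment.

    Transcription of the resulting insert yields an RNA whose two arms
    are complementary and can fold back into a double-stranded stem,
    with the linker forming the loop.
    """
    return sense_fragment + linker + reverse_complement(sense_fragment)


# Placeholder fragment and linker, not taken from the Sequence Listing.
insert = hairpin_insert("ATGGC", "TTTT")
```

The linker argument may be of any length, consistent with the variable-length DNA linker region described above.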

The antisense nucleic acid molecules of the invention are typically administered to a cell or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding a LMP to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. The hybridization can be by conventional nucleotide complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid molecule which binds to DNA duplexes, through specific interactions in the major groove of the double helix. The antisense molecule can be modified such that it specifically binds to a receptor or an antigen expressed on a selected cell surface, e.g., by linking the antisense nucleic acid molecule to a peptide or an antibody which binds to a cell surface receptor or antigen. The antisense nucleic acid molecule can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong prokaryotic, viral, or eukaryotic (including plant) promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. 1987, Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. 1987, Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. 1987, FEBS Lett. 215:327-330).

In still another embodiment, an antisense nucleic acid of the invention is a ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity which are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff & Gerlach 1988, Nature 334:585-591)) can be used to catalytically cleave LMP mRNA transcripts to thereby inhibit translation of LMP mRNA. A ribozyme having specificity for a LMP-encoding nucleic acid can be designed based upon the nucleotide sequence of a LMP cDNA disclosed herein (i.e., Pk109 in FIG. 9A) or on the basis of a heterologous sequence to be isolated according to methods taught in this invention. For example, a derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence of the active site is complementary to the nucleotide sequence to be cleaved in a LMP-encoding mRNA (see, e.g., Cech et al., U.S. Pat. No. 4,987,071 and Cech et al., U.S. Pat. No. 5,116,742). Alternatively, LMP mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules (see, e.g., Bartel, D. & Szostak J. W. 1993, Science 261:1411-1418).

Alternatively, LMP gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of a LMP nucleotide sequence (e.g., a LMP promoter and/or enhancers) to form triple helical structures that prevent transcription of a LMP gene in target cells (see generally, Helene C. 1991, Anticancer Drug Des. 6:569-84; Helene C. et al. 1992, Ann. N.Y. Acad. Sci. 660:27-36; and Maher, L. J. 1992, BioEssays 14:807-15).

Another aspect of the invention pertains to vectors, preferablyexpression vectors, containing a nucleic acid encoding a LMP (or aportion thereof). As used herein, the term “vector” refers to a nucleicacid molecule capable of transporting another nucleic acid to which ithas been linked. One type of vector is a “plasmid”, which refers to acircular double stranded DNA loop into which additional DNA segments canbe ligated. Another type of vector is a viral vector, wherein additionalDNA segments can be ligated into the viral genome. Certain vectors arecapable of autonomous replication in a host cell into which they areintroduced (e.g., bacterial vectors having a bacterial origin ofreplication and episomal mammalian vectors). Other vectors (e.g.,non-episomal mammalian vectors) are integrated into the genome of a hostcell upon introduction into the host cell, and thereby are replicatedalong with the host genome. Moreover, certain vectors are capable ofdirecting the expression of genes to which they are operatively linked.Such vectors are referred to herein as “expression vectors”. In general,expression vectors of utility in recombinant DNA techniques are often inthe form of plasmids. In the present specification, “plasmid” and“vector” can be used inter-changeably as the plasmid is the mostcommonly used form of vector. However, the invention is intended toinclude such other forms of expression vectors, such as viral vectors(e.g., replication defective retroviruses, adenoviruses andadeno-associated viruses), which serve equivalent functions.

The recombinant expression vectors of the invention comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell, which means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, which are operably linked to the nucleic acid sequence to be expressed. Within a recombinant expression vector, “operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner which allows for expression of the nucleotide sequence and both sequences are fused to each other so that each fulfills its proposed function (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term “regulatory sequence” is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990), or see Gruber and Crosby, in: Methods in Plant Molecular Biology and Biotechnology, CRC Press, Boca Raton, Fla., eds.: Glick & Thompson, Chapter 7, 89-108, including the references therein. Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cell and those which direct expression of the nucleotide sequence only in certain host cells or under certain conditions. It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. 
The expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., LMPs, mutant forms of LMPs, fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for expression of LMPs in prokaryotic or eukaryotic cells. For example, LMP genes can be expressed in bacterial cells, insect cells (using baculovirus expression vectors), yeast and other fungal cells (see Romanos M. A. et al. 1992, Foreign gene expression in yeast: a review, Yeast 8:423-488; van den Hondel, C. A. M. J. J. et al. 1991, Heterologous gene expression in filamentous fungi, in: More Gene Manipulations in Fungi, Bennet & Lasure, eds., p. 396-428, Academic Press: San Diego; and van den Hondel & Punt 1991, Gene transfer systems and vector development for filamentous fungi, in: Applied Molecular Genetics of Fungi, Peberdy et al., eds., p. 1-28, Cambridge University Press: Cambridge), algae (Falciatore et al. 1999, Marine Biotechnology 1:239-251), ciliates of the types: Holotrichia, Peritrichia, Spirotrichia, Suctoria, Tetrahymena, Paramecium, Colpidium, Glaucoma, Platyophrya, Potomacus, Pseudocohnilembus, Euplotes, Engelmaniella, and Stylonychia, especially of the genus Stylonychia lemnae with vectors following a transformation method as described in WO 98/01572, and multicellular plant cells (see Schmidt & Willmitzer 1988, High efficiency Agrobacterium tumefaciens-mediated transformation of Arabidopsis thaliana leaf and cotyledon explants, Plant Cell Rep.:583-586; Plant Molecular Biology and Biotechnology, CRC Press, Boca Raton, Fla., chapter 6/7, S.71-119 (1993); White, Jenes et al., Techniques for Gene Transfer, in: Transgenic Plants, Vol. 1, Engineering and Utilization, eds.: Kung and Wu, Academic Press 1993, 128-43; Potrykus 1991, Annu. Rev. Plant Physiol. Plant Mol. Biol. 42:205-225 (and references cited therein)) or mammalian cells. Suitable host cells are discussed further in Goeddel, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). 
Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out withvectors containing constitutive or inducible promoters directing theexpression of either fusion or non-fusion proteins. Fusion vectors add anumber of amino acids to a protein encoded therein, usually to the aminoterminus of the recombinant protein but also to the C-terminus or fusedwithin suitable regions in the proteins. Such fusion vectors typicallyserve one or more of the following purposes: 1) to increase expressionof recombinant protein; 2) to increase the solubility of the recombinantprotein; and 3) to aid in the purification of the recombinant protein byacting as a ligand in affinity purification. Often, in fusion expressionvectors, a proteolytic cleavage site is introduced at the junction ofthe fusion moiety and the recombinant protein to enable separation ofthe recombinant protein from the fusion moiety subsequent topurification of the fusion protein. Such enzymes, and their cognaterecognition sequences, include Factor Xa, thrombin and enterokinase.

Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith & Johnson 1988, Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. In one embodiment, the coding sequence of the LMP is cloned into a pGEX expression vector to create a vector encoding a fusion protein comprising, from the N-terminus to the C-terminus, GST-thrombin cleavage site-X protein. The fusion protein can be purified by affinity chromatography using glutathione-agarose resin. Recombinant LMP unfused to GST can be recovered by cleavage of the fusion protein with thrombin.
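For illustration, the arrangement "fusion moiety, thrombin cleavage site, target protein" and the recovery step can be sketched in Python. The sketch assumes the commonly used thrombin recognition sequence Leu-Val-Pro-Arg-Gly-Ser, cleaved between Arg and Gly; the site actually present depends on the vector chosen. The tag and target sequences and all function names are hypothetical placeholders.

```python
# Illustrative sketch only. Assumes the thrombin site LVPR|GS; the tag and
# target sequences below are placeholders, not real protein sequences.
THROMBIN_SITE = "LVPRGS"
CUT_OFFSET = 4  # cleavage occurs after the Arg (R) of LVPR


def build_fusion(tag: str, target: str) -> str:
    """Assemble an N-terminal tag, the thrombin site, and the target protein."""
    return tag + THROMBIN_SITE + target


def cleave_at_thrombin(fusion: str) -> tuple[str, str]:
    """Split a fusion protein at the first thrombin site.

    Returns (tag_part, released_part). The released part keeps the
    Gly-Ser residues that follow the cleavage point.
    """
    idx = fusion.find(THROMBIN_SITE)
    if idx < 0:
        raise ValueError("no thrombin site found")
    cut = idx + CUT_OFFSET
    return fusion[:cut], fusion[cut:]


fusion = build_fusion("MAAAAAAAAA", "MKTLLV")  # placeholder tag and target
tag_part, released = cleave_at_thrombin(fusion)
```

Under the stated assumption, the released protein carries two additional N-terminal residues (Gly-Ser) derived from the cleavage site.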

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al. 1988, Gene 69:301-315) and pET 11d (Studier et al. 1990, Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 60-89). Target gene expression from the pTrc vector relies on host RNA polymerase transcription from a hybrid trp-lac fusion promoter. Target gene expression from the pET 11d vector relies on transcription from a T7 gn10-lac fusion promoter mediated by a coexpressed viral RNA polymerase (T7 gn1). This viral polymerase is supplied by host strains BL21(DE3) or HMS174(DE3) from a resident λ prophage harboring a T7 gn1 gene under the transcriptional control of the lacUV5 promoter.

One strategy to maximize recombinant protein expression is to express the protein in a host bacteria with an impaired capacity to proteolytically cleave the recombinant protein (Gottesman S. 1990, Gene Expression Technology: Methods in Enzymology 185:119-128, Academic Press, San Diego, Calif.). Another strategy is to alter the nucleic acid sequence of the nucleic acid to be inserted into an expression vector so that the individual codons for each amino acid are those preferentially utilized in the bacterium chosen for expression (Wada et al. 1992, Nucleic Acids Res. 20:2111-2118). Such alteration of nucleic acid sequences of the invention can be carried out by standard DNA synthesis techniques.
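The codon alteration strategy can be illustrated by the following Python sketch, in which each codon of a coding sequence is replaced by a codon preferred in the chosen host while the encoded amino acid sequence is left unchanged. The standard genetic code is used; the table of preferred codons is a small hypothetical placeholder that would in practice be taken from codon usage data for the host, and the function names are illustrative only.

```python
# Illustrative sketch only: `preferred` is a placeholder table, not real
# codon usage data for any particular bacterium.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TO_AA = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TO_AA[cds[i:i + 3].upper()] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict[str, str]) -> str:
    """Replace each codon with the host-preferred codon for the same amino acid.

    Codons whose amino acid has no entry in `preferred` are kept as they are.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3].upper()
        out.append(preferred.get(CODON_TO_AA[codon], codon))
    return "".join(out)


preferred = {"L": "CTG", "R": "CGT"}        # hypothetical host preferences
original = "ATGTTAAGATAA"                   # placeholder: Met-Leu-Arg-stop
optimized = recode(original, preferred)
```

The check that the translation is unchanged (`translate(original) == translate(optimized)`) is the essential property of such an alteration: only synonymous codons are exchanged.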

In another embodiment, the LMP expression vector is a yeast expression vector. Examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari et al. 1987, EMBO J. 6:229-234), pMFa (Kurjan & Herskowitz 1982, Cell 30:933-943), pJRY88 (Schultz et al. 1987, Gene 54:113-123), and pYES2 (Invitrogen Corporation, San Diego, Calif.). Vectors and methods for the construction of vectors appropriate for use in other fungi, such as the filamentous fungi, include those detailed in: van den Hondel & Punt 1991, “Gene transfer systems and vector development for filamentous fungi”, in: Applied Molecular Genetics of Fungi, Peberdy et al., eds., p. 1-28, Cambridge University Press: Cambridge.

Alternatively, the LMPs of the invention can be expressed in insect cells using baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al. 1983, Mol. Cell Biol. 3:2156-2165) and the pVL series (Luckow & Summers 1989, Virology 170:31-39).

In yet another embodiment, a nucleic acid of the invention is expressed in mammalian cells using a mammalian expression vector. Examples of mammalian expression vectors include pCDM8 (Seed 1987, Nature 329:840) and pMT2PC (Kaufman et al. 1987, EMBO J. 6:187-195). When used in mammalian cells, the expression vector's control functions are often provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus and Simian Virus 40. For other suitable expression systems for both prokaryotic and eukaryotic cells see chapters 16 and 17 of Sambrook, Fritsch and Maniatis, Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989.

In another embodiment, the LMPs of the invention may be expressed in uni-cellular plant cells (such as algae, see Falciatore et al. 1999, Marine Biotechnology 1:239-251 and references therein) and plant cells from higher plants (e.g., the spermatophytes, such as crop plants). Examples of plant expression vectors include those detailed in: Becker, Kemper, Schell and Masterson (1992, “New plant binary vectors with selectable markers located proximal to the left border”, Plant Mol. Biol. 20:1195-1197) and Bevan (1984, “Binary Agrobacterium vectors for plant transformation”, Nucleic Acids Res. 12:8711-8721; Vectors for Gene Transfer in Higher Plants; in: Transgenic Plants, Vol. 1, Engineering and Utilization, eds.: Kung and R. Wu, Academic Press, 1993, S. 15-38).

A plant expression cassette preferably contains regulatory sequences capable of driving gene expression in plant cells and which are operably linked so that each sequence can fulfil its function, such as termination of transcription by polyadenylation signals. Preferred polyadenylation signals are those originating from Agrobacterium tumefaciens T-DNA such as the gene 3 known as octopine synthase of the Ti-plasmid pTiACH5 (Gielen et al. 1984, EMBO J. 3:835) or functional equivalents thereof, but all other terminators functionally active in plants are also suitable.

As plant gene expression is very often not limited to the transcriptional level, a plant expression cassette preferably contains other operably linked sequences like translational enhancers such as the overdrive-sequence containing the 5′-untranslated leader sequence from tobacco mosaic virus enhancing the protein per RNA ratio (Gallie et al. 1987, Nucleic Acids Res. 15:8693-8711).

Plant gene expression has to be operably linked to an appropriate promoter conferring gene expression in a timely, cell or tissue specific manner. Preferred are promoters driving constitutive expression (Benfey et al. 1989, EMBO J. 8:2195-2202) like those derived from plant viruses like the 35S CaMV (Franck et al. 1980, Cell 21:285-294), the 19S CaMV (see also U.S. Pat. No. 5,352,605 and WO 84/02913) or plant promoters like those from Rubisco small subunit described in U.S. Pat. No. 4,962,028. Even more preferred are seed-specific promoters driving expression of LMP proteins during all or selected stages of seed development. Seed-specific plant promoters are known to those of ordinary skill in the art and are identified and characterized using seed-specific mRNA libraries and expression profiling techniques. Seed-specific promoters include the napin-gene promoter from rapeseed (U.S. Pat. No. 5,608,152), the USP-promoter from Vicia faba (Baeumlein et al. 1991, Mol. Gen. Genetics 225:459-67), the oleosin-promoter from Arabidopsis (WO 98/45461), the phaseolin-promoter from Phaseolus vulgaris (U.S. Pat. No. 5,504,200), the Bce4-promoter from Brassica (WO 91/13980) or the legumin B4 promoter (LeB4; Baeumlein et al. 1992, Plant J. 2:233-239) as well as promoters conferring seed specific expression in monocot plants like maize, barley, wheat, rye, rice etc. Suitable promoters to note are the lpt2 or lpt1-gene promoter from barley (WO 95/15389 and WO 95/23230) or those described in WO 99/16890 (promoters from the barley hordein-gene, the rice glutelin gene, the rice oryzin gene, the rice prolamin gene, the wheat gliadin gene, wheat glutelin gene, the maize zein gene, the oat glutelin gene, the Sorghum kafirin-gene, the rye secalin gene).

Plant gene expression can also be facilitated via an inducible promoter (for review see Gatz 1997, Annu. Rev. Plant Physiol. Plant Mol. Biol. 48:89-108). Chemically inducible promoters are especially suitable if gene expression is desired in a time specific manner. Examples of such promoters are a salicylic acid inducible promoter (WO 95/19443), a tetracycline inducible promoter (Gatz et al. 1992, Plant J. 2:397-404) and an ethanol inducible promoter (WO 93/21334).

Promoters responding to biotic or abiotic stress conditions are also suitable promoters, such as the pathogen inducible PRP1-gene promoter (Ward et al., 1993, Plant. Mol. Biol. 22:361-366), the heat inducible hsp80-promoter from tomato (U.S. Pat. No. 5,187,267), cold inducible alpha-amylase promoter from potato (WO 96/12814) or the wound-inducible pinII-promoter (EP 375091).

Other preferred sequences for use in plant gene expression cassettes are targeting sequences necessary to direct the gene-product into its appropriate cell compartment (for review see Kermode 1996, Crit. Rev. Plant Sci. 15:285-423 and references cited therein) such as the vacuole, the nucleus, all types of plastids like amyloplasts, chloroplasts, chromoplasts, the extracellular space, mitochondria, the endoplasmic reticulum, oil bodies, peroxisomes and other compartments of plant cells. Also especially suited are promoters that confer plastid-specific gene expression, as plastids are the compartment where precursors and some end products of lipid biosynthesis are synthesized. Suitable promoters such as the viral RNA-polymerase promoter are described in WO 95/16783 and WO 97/06250 and the clpP-promoter from Arabidopsis described in WO 99/46394.

The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. That is, the DNA molecule is operatively linked to a regulatory sequence in a manner which allows for expression (by transcription of the DNA molecule) of an RNA molecule which is antisense to LMP mRNA. Regulatory sequences operatively linked to a nucleic acid cloned in the antisense orientation can be chosen which direct the continuous expression of the antisense RNA molecule in a variety of cell types, for instance viral promoters and/or enhancers, or regulatory sequences can be chosen which direct constitutive, tissue specific or cell type specific expression of antisense RNA. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or attenuated virus in which antisense nucleic acids are produced under the control of a high efficiency regulatory region, the activity of which can be determined by the cell type into which the vector is introduced. For a discussion of the regulation of gene expression using antisense genes see Weintraub et al. (1986, Antisense RNA as a molecular tool for genetic analysis, Reviews - Trends in Genetics, Vol. 1) and Mol et al. (1990, FEBS Lett. 268:427-430).

Another aspect of the invention pertains to host cells into which arecombinant expression vector of the invention has been introduced. Theterms “host cell” and “recombinant host cell” are used interchangeablyherein. It is to be understood that such terms refer not only to theparticular subject cell but also to the progeny or potential progeny ofsuch a cell. Because certain modifications may occur in succeedinggenerations due to either mutation or environmental influences, suchprogeny may not, in fact, be identical to the parent cell, but are stillincluded within the scope of the term as used herein. A host cell can beany prokaryotic or eukaryotic cell. For example, a LMP can be expressedin bacterial cells, insect cells, fungal cells, mammalian cells (such asChinese hamster ovary cells (CHO) or COS cells), algae, ciliates orplant cells. Other suitable host cells are known to those skilled in theart.

Vector DNA can be introduced into prokaryotic or eukaryotic cells viaconventional transformation or transfection techniques. As used herein,the terms “transformation” and “transfection”, “conjugation” and“transduction” are intended to refer to a variety of art-recognizedtechniques for introducing foreign nucleic acid (e.g., DNA) into a hostcell, including calcium phosphate or calcium chloride co-precipitation,DEAE-dextran-mediated transfection, lipofection, natural competence,chemical-mediated transfer, or electroporation. Suitable methods fortransforming or transfecting host cells including plant cells can befound in Sambrook et al. (1989, Molecular Cloning: A Laboratory Manual.2nd, ed., Cold Spring Harbor Laboratory, Cold Spring Harbor LaboratoryPress, Cold Spring Harbor, N.Y.) and other laboratory manuals such asMethods in Molecular Biology 1995, Vol. 44, Agrobacterium protocols, ed:Gartland and Davey, Humana Press, Totowa, N.J.

For stable transfection of mammalian and plant cells, it is known that,depending upon the expression vector and transfection technique used,only a small fraction of cells may integrate the foreign DNA into theirgenome. In order to identify and select these integrants, a gene thatencodes a selectable marker (e.g., resistance to antibiotics) isgenerally introduced into the host cells along with the gene ofinterest. Preferred selectable markers include those which conferresistance to drugs, such as G418, hygromycin, kanamycin andmethotrexate or in plants that confer resistance towards an herbicidesuch as glyphosate or glufosinate. A nucleic acid encoding a selectablemarker can be introduced into a host cell on the same vector as thatencoding a LMP or can be introduced on a separate vector. Cells stablytransfected with the introduced nucleic acid can be identified by, forexample, drug selection (e.g., cells that have incorporated theselectable marker gene will survive, while the other cells die).

To create a homologous recombinant microorganism, a vector is preparedwhich contains at least a portion of a LMP gene into which a deletion,addition or substitution has been introduced to thereby alter, e.g.,functionally disrupt, the LMP gene. Preferably, this LMP gene is anArabidopsis thaliana LMP gene, but it can be a homologue from a relatedplant or even from a mammalian, yeast, or insect source. In a preferredembodiment, the vector is designed such that, upon homologousrecombination, the endogenous LMP gene is functionally disrupted (i.e.,no longer encodes a functional protein; also referred to as a knock-outvector). Alternatively, the vector can be designed such that, uponhomologous recombination, the endogenous LMP gene is mutated orotherwise altered but still encodes functional protein (e.g., theupstream regulatory region can be altered to thereby alter theexpression of the endogenous LMP). To create a point mutation viahomologous recombination, DNA-RNA hybrids can be used in a techniqueknown as chimeraplasty (Cole-Strauss et al. 1999, Nucleic Acids Res.27:1323-1330 and Kmiec 1999, American Scientist 87:240-247). Homologousrecombination procedures in Arabidopsis thaliana are also well known inthe art and are contemplated for use herein.

In a homologous recombination vector, the altered portion of the LMPgene is flanked at its 5′ and 3′ ends by additional nucleic acid of theLMP gene to allow for homologous recombination to occur between theexogenous LMP gene carried by the vector and an endogenous LMP gene in amicroorganism or plant. The additional flanking LMP nucleic acid is ofsufficient length for successful homologous recombination with theendogenous gene. Typically, several hundreds of base pairs up tokilobases of flanking DNA (both at the 5′ and 3′ ends) are included inthe vector (see e.g., Thomas & Capecchi 1987, Cell 51:503, for adescription of homologous recombination vectors). The vector isintroduced into a microorganism or plant cell (e.g., viapolyethyleneglycol mediated DNA). Cells in which the introduced LMP genehas homologously recombined with the endogenous LMP gene are selectedusing art-known techniques.

In another embodiment, recombinant microorganisms can be produced which contain selected systems which allow for regulated expression of the introduced gene. For example, inclusion of a LMP gene on a vector placing it under control of the lac operon permits expression of the LMP gene only in the presence of IPTG. Such regulatory systems are well known in the art.

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can be used to produce (i.e., express) a LMP. Accordingly, the invention further provides methods for producing LMPs using the host cells of the invention. In one embodiment, the method comprises culturing a host cell of the invention (into which a recombinant expression vector encoding a LMP has been introduced, or which contains a wild-type or altered LMP gene in its genome) in a suitable medium until LMP is produced. In another embodiment, the method further comprises isolating LMPs from the medium or the host cell.

Another aspect of the invention pertains to isolated LMPs, andbiologically active portions thereof. An “isolated” or “purified”protein or biologically active portion thereof is substantially free ofcellular material when produced by recombinant DNA techniques, orchemical precursors or other chemicals when chemically synthesized. Thelanguage “substantially free of cellular material” includes preparationsof LMP in which the protein is separated from cellular components of thecells in which it is naturally or recombinantly produced. In oneembodiment, the language “substantially free of cellular material”includes preparations of LMP having less than about 30% (by dry weight)of non-LMP (also referred to herein as a “contaminating protein”), morepreferably less than about 20% of non-LMP, still more preferably lessthan about 10% of non-LMP, and most preferably less than about 5%non-LMP. When the LMP or biologically active portion thereof isrecombinantly produced, it is also preferably substantially free ofculture medium, i.e., culture medium represents less than about 20%,more preferably less than about 10%, and most preferably less than about5% of the volume of the protein preparation. The language “substantiallyfree of chemical precursors or other chemicals” includes preparations ofLMP in which the protein is separated from chemical precursors or otherchemicals which are involved in the synthesis of the protein. In oneembodiment, the language “substantially free of chemical precursors orother chemicals” includes preparations of LMP having less than about 30%(by dry weight) of chemical precursors or non-LMP chemicals, morepreferably less than about 20% chemical precursors or non-LMP chemicals,still more preferably less than about 10% chemical precursors or non-LMPchemicals, and most preferably less than about 5% chemical precursors ornon-LMP chemicals. 
In preferred embodiments, isolated proteins orbiologically active portions thereof lack contaminating proteins fromthe same organism from which the LMP is derived. Typically, suchproteins are produced by recombinant expression of, for example, anArabidopsis thaliana LMP in other plants than Arabidopsis thaliana ormicroorganisms, algae or fungi.

An isolated LMP or a portion thereof of the invention can participate inthe metabolism of compounds necessary for the production of seed storagecompounds in Arabidopsis thaliana, or of cellular membranes, or has oneor more of the activities set forth in Table 3. In preferredembodiments, the protein or portion thereof comprises an amino acidsequence which is sufficiently homologous to an amino acid sequenceencoded by a nucleic acid of the Figures such that the protein orportion thereof maintains the ability to participate in the metabolismof compounds necessary for the construction of cellular membranes inArabidopsis thaliana, or in the transport of molecules across thesemembranes. The portion of the protein is preferably a biologicallyactive portion as described herein. In another preferred embodiment, aLMP of the invention has an amino acid sequence encoded by a nucleicacid shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ IDNO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ IDNO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ IDNO:29, or SEQ ID NO:31. In yet another preferred embodiment, the LMP hasan amino acid sequence which is encoded by a nucleotide sequence whichhybridizes, e.g., hybridizes under stringent conditions, to a nucleotidesequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7,SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ IDNO:29, or SEQ ID NO:31. 
In still another preferred embodiment, the LMP has an amino acid sequence which is encoded by a nucleotide sequence that is at least about 50-60%, preferably at least about 60-70%, more preferably at least about 70-80%, 80-90%, 90-95%, and even more preferably at least about 96%, 97%, 98%, 99% or more homologous to one of the amino acid sequences encoded by a nucleic acid shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31. The preferred LMPs of the present invention also preferably possess at least one of the LMP activities described herein. For example, a preferred LMP of the present invention includes an amino acid sequence encoded by a nucleotide sequence which hybridizes, e.g., hybridizes under stringent conditions, to a nucleotide sequence shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31 of the Figures, and which can participate in the metabolism of compounds necessary for the construction of cellular membranes in Arabidopsis thaliana, or in the transport of molecules across these membranes, or which has one or more of the activities set forth in Table 3.

In other embodiments, the LMP is substantially homologous to an amino acid sequence encoded by a nucleic acid shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31 and retains the functional activity of the protein of one of the sequences encoded by a nucleic acid shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31 yet differs in amino acid sequence due to natural variation or mutagenesis, as described in detail above. Accordingly, in another embodiment, the LMP is a protein which comprises an amino acid sequence which is at least about 50-60%, preferably at least about 60-70%, and more preferably at least about 70-80%, 80-90%, 90-95%, and most preferably at least about 96%, 97%, 98%, 99% or more homologous to an entire amino acid sequence and which has at least one of the LMP activities described herein. In another embodiment, the invention pertains to a full Arabidopsis thaliana protein which is substantially homologous to an entire amino acid sequence encoded by a nucleic acid shown in SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, SEQ ID NO:25, SEQ ID NO:27, SEQ ID NO:29, or SEQ ID NO:31.
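The percentage thresholds above presuppose some way of scoring one sequence against another. As an illustration only, and assuming that "percent homologous" is read as percent identity over a pairwise alignment that has already been computed (this passage does not name an alignment program, and the function name and toy sequences are hypothetical), the calculation could be sketched in Python as:

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Both inputs are rows of one pairwise alignment, so they must have the
    same length; '-' marks a gap. This is an assumed scoring convention,
    not one prescribed by the specification.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are not counted in this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from the Sequence Listing.
    print(round(percent_identity("MKTA-IAK", "MKSAYIAK"), 1))
```

Other conventions (for example, counting gap columns in the denominator, or scoring conservative substitutions as similar) would give different percentages for the same pair of sequences.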

Homologues of the LMP can be generated by mutagenesis, e.g., discrete point mutation or truncation of the LMP. As used herein, the term “homologue” refers to a variant form of the LMP which acts as an agonist or antagonist of the activity of the LMP. An agonist of the LMP can retain substantially the same, or a subset, of the biological activities of the LMP. An antagonist of the LMP can inhibit one or more of the activities of the naturally occurring form of the LMP, by, for example, competitively binding to a downstream or upstream member of the cell membrane component metabolic cascade which includes the LMP, or by binding to a LMP which mediates transport of compounds across such membranes, thereby preventing translocation from taking place.

In an alternative embodiment, homologues of the LMP can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of the LMP for LMP agonist or antagonist activity. In one embodiment, a variegated library of LMP variants is generated by combinatorial mutagenesis at the nucleic acid level and is encoded by a variegated gene library. A variegated library of LMP variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential LMP sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display) containing the set of LMP sequences therein. There are a variety of methods which can be used to produce libraries of potential LMP homologues from a degenerate oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be performed in an automatic DNA synthesizer, and the synthetic gene then ligated into an appropriate expression vector. Use of a degenerate set of genes allows for the provision, in one mixture, of all of the sequences encoding the desired set of potential LMP sequences. Methods for synthesizing degenerate oligonucleotides are known in the art (see, e.g., Narang 1983, Tetrahedron 39:3; Itakura et al. 1984, Annu. Rev. Biochem. 53:323; Itakura et al. 1984, Science 198:1056; Ike et al. 1983, Nucleic Acids Res. 11:477).
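To make concrete what "a degenerate set" of sequences means, the following Python sketch enumerates every plain sequence represented by one degenerate oligonucleotide written in the standard IUPAC nucleotide ambiguity codes. It is a minimal illustration rather than part of the disclosed method; the function name and the example oligonucleotides are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(oligo: str) -> list[str]:
    """Return every unambiguous sequence encoded by a degenerate oligo."""
    try:
        pools = [IUPAC[base] for base in oligo.upper()]
    except KeyError as err:
        raise ValueError(f"not an IUPAC nucleotide code: {err.args[0]}") from None
    # The Cartesian product over positions yields the full degenerate set.
    return ["".join(combo) for combo in product(*pools)]


if __name__ == "__main__":
    # Hypothetical example: a codon with a fully degenerate third position.
    print(expand_degenerate("GCN"))
```

The size of the set is the product of the number of bases allowed at each position, which is why library complexity grows quickly with the number of degenerate positions.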

In addition, libraries of fragments of the LMP coding sequences can be used to generate a variegated population of LMP fragments for screening and subsequent selection of homologues of a LMP. In one embodiment, a library of coding sequence fragments can be generated by treating a double stranded PCR fragment of a LMP coding sequence with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double stranded DNA, renaturing the DNA to form double stranded DNA which can include sense/antisense pairs from different nicked products, removing single stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector. By this method, an expression library can be derived which encodes N-terminal, C-terminal and internal fragments of various sizes of the LMP.

Several techniques are known in the art for screening gene products of combinatorial libraries made by point mutations or truncation, and for screening cDNA libraries for gene products having a selected property. Such techniques are adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis of LMP homologues. The most widely used techniques, which are amenable to high-throughput analysis, for screening large gene libraries typically include cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates isolation of the vector encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a new technique which enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify LMP homologues (Arkin & Youvan 1992, Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. 1993, Protein Engineering 6:327-331).

In another embodiment, cell based assays can be exploited to analyze a variegated LMP library, using methods well known in the art.

The nucleic acid molecules, proteins, protein homologues, fusion proteins, primers, vectors, and host cells described herein can be used in one or more of the following methods: identification of Arabidopsis thaliana and related organisms; mapping of genomes of organisms related to Arabidopsis thaliana; identification and localization of Arabidopsis thaliana sequences of interest; evolutionary studies; determination of LMP regions required for function; modulation of a LMP activity; modulation of the metabolism of one or more cell functions; modulation of the transmembrane transport of one or more compounds; and modulation of seed storage compound accumulation.

The plant Arabidopsis thaliana represents one member of higher (or seed) plants. It is related to other plants such as Brassica napus or soybean which require light to drive photosynthesis and growth. Plants like Arabidopsis thaliana and Brassica napus share a high degree of homology on the DNA sequence and polypeptide level, allowing the use of heterologous screening of DNA molecules with probes evolving from other plants or organisms, thus enabling the derivation of a consensus sequence suitable for heterologous screening or functional annotation and prediction of gene functions in third species. The ability to identify such functions can therefore have significant relevance, e.g., prediction of substrate specificity of enzymes. Further, these nucleic acid molecules may serve as reference points for the mapping of Arabidopsis genomes, or of genomes of related organisms.

The LMP nucleic acid molecules of the invention have a variety of uses. First, they may be used to identify an organism as being Arabidopsis thaliana or a close relative thereof. Also, they may be used to identify the presence of Arabidopsis thaliana or a relative thereof in a mixed population of microorganisms. The invention provides the nucleic acid sequences of a number of Arabidopsis thaliana genes; by probing the extracted genomic DNA of a culture of a unique or mixed population of microorganisms under stringent conditions with a probe spanning a region of an Arabidopsis thaliana gene which is unique to this organism, one can ascertain whether this organism is present.

Further, the nucleic acid and protein molecules of the invention may serve as markers for specific regions of the genome. This has utility not only in the mapping of the genome, but also for functional studies of Arabidopsis thaliana proteins. For example, to identify the region of the genome to which a particular Arabidopsis thaliana DNA-binding protein binds, the Arabidopsis thaliana genome could be digested, and the fragments incubated with the DNA-binding protein. Those which bind the protein may be additionally probed with the nucleic acid molecules of the invention, preferably with readily detectable labels; binding of such a nucleic acid molecule to the genome fragment enables the localization of the fragment to the genome map of Arabidopsis thaliana, and, when performed multiple times with different enzymes, facilitates a rapid determination of the nucleic acid sequence to which the protein binds. Further, the nucleic acid molecules of the invention may be sufficiently homologous to the sequences of related species such that these nucleic acid molecules may serve as markers for the construction of a genomic map in related plants.

The LMP nucleic acid molecules of the invention are also useful for evolutionary and protein structural studies. The metabolic and transport processes in which the molecules of the invention participate are utilized by a wide variety of prokaryotic and eukaryotic cells; by comparing the sequences of the nucleic acid molecules of the present invention to those encoding similar enzymes from other organisms, the evolutionary relatedness of the organisms can be assessed. Similarly, such a comparison permits an assessment of which regions of the sequence are conserved and which are not, which may aid in determining those regions of the protein which are essential for the functioning of the enzyme. This type of determination is of value for protein engineering studies and may give an indication of what the protein can tolerate in terms of mutagenesis without losing function.

Manipulation of the LMP nucleic acid molecules of the invention may result in the production of LMPs having functional differences from the wild-type LMPs. These proteins may be improved in efficiency or activity, may be present in greater numbers in the cell than is usual, or may be decreased in efficiency or activity.

There are a number of mechanisms by which the alteration of a LMP of the invention may directly affect the accumulation of seed storage compounds. In the case of plants expressing LMPs, increased transport can lead to altered accumulation of compounds and/or solute partitioning within the plant tissue and organs which ultimately could be used to affect the accumulation of one or more seed storage compounds during seed development. An example is provided by Mitsukawa et al. (1997, Proc. Natl. Acad. Sci. USA 94:7098-7102), where overexpression of an Arabidopsis high-affinity phosphate transporter gene in tobacco cultured cells enhanced cell growth under phosphate-limited conditions. Phosphate availability also significantly affects the production of sugars and metabolic intermediates (Hurry et al. 2000, Plant J. 24:383-396) and the lipid composition in leaves and roots (Härtel et al. 2000, Proc. Natl. Acad. Sci. USA 97:10649-10654). Likewise, the activity of the plant ACCase has been demonstrated to be regulated by phosphorylation (Savage & Ohlrogge 1999, Plant J. 18:521-527) and alterations in the activity of the kinases and phosphatases (LMPs) that act on the ACCase could lead to increased or decreased levels of seed lipid accumulation. Moreover, the presence of lipid kinase activities in chloroplast envelope membranes suggests that signal transduction pathways and/or membrane protein regulation occur in envelopes (see, e.g., Müller et al. 2000, J. Biol. Chem. 275:19475-19481 and literature cited therein). The ABI1 and ABI2 genes encode two protein serine/threonine phosphatases 2C, which are regulators in the abscisic acid signaling pathway, and thereby in early and late seed development (e.g. Merlot et al. 2001, Plant J. 25:295-303).

The present invention also provides antibodies which specifically bind to an LMP-polypeptide, or a portion thereof, as encoded by a nucleic acid disclosed herein or as described herein. Antibodies can be made by many well-known methods (see, e.g. Harlow and Lane, “Antibodies: A Laboratory Manual” Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1988). Briefly, purified antigen can be injected into an animal in an amount and in intervals sufficient to elicit an immune response. Antibodies can either be purified directly, or spleen cells can be obtained from the animal. The cells can then be fused with an immortal cell line and screened for antibody secretion. The antibodies can be used to screen nucleic acid clone libraries for cells secreting the antigen. Those positive clones can then be sequenced (see, for example, Kelly et al. 1992, Bio/Technology 10:163-167; Bebbington et al. 1992, Bio/Technology 10:169-175).

The phrase “selectively binds” with the polypeptide refers to a binding reaction which is determinative of the presence of the protein in a heterogeneous population of proteins and other biologics. Thus, under designated immunoassay conditions, the specified antibodies bound to a particular protein do not bind in a significant amount to other proteins present in the sample. Selective binding to an antibody under such conditions may require an antibody that is selected for its specificity for a particular protein. A variety of immunoassay formats may be used to select antibodies that selectively bind with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select antibodies selectively immunoreactive with a protein. See Harlow and Lane “Antibodies, A Laboratory Manual” Cold Spring Harbor Publications, New York (1988), for a description of immunoassay formats and conditions that could be used to determine selective binding.

In some instances, it is desirable to prepare monoclonal antibodies from various hosts. A description of techniques for preparing such monoclonal antibodies may be found in Stites et al., editors, “Basic and Clinical Immunology,” (Lange Medical Publications, Los Altos, Calif., Fourth Edition) and references cited therein, and in Harlow and Lane (“Antibodies, A Laboratory Manual” Cold Spring Harbor Publications, New York, 1988).

Throughout this application, various publications are referenced. The disclosures of all of these publications and those references cited within those publications in their entireties are hereby incorporated by reference into this application in order to more fully describe the state of the art to which this invention pertains.

It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the scope or spirit of the invention. Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. It is intended that the specification and Examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the claims included herein.

Examples

Example 1 General Processes

a) General Cloning Processes:

Cloning processes such as, for example, restriction cleavages, agarose gel electrophoresis, purification of DNA fragments, transfer of nucleic acids to nitrocellulose and nylon membranes, linkage of DNA fragments, transformation of Escherichia coli and yeast cells, growth of bacteria and sequence analysis of recombinant DNA were carried out as described in Sambrook et al. (1989, Cold Spring Harbor Laboratory Press: ISBN 0-87969-309-6) or Kaiser, Michaelis and Mitchell (1994, “Methods in Yeast Genetics”, Cold Spring Harbor Laboratory Press: ISBN 0-87969-451-3).

b) Chemicals:

The chemicals used were obtained, if not mentioned otherwise in the text, in p.a. quality from the companies Fluka (Neu-Ulm), Merck (Darmstadt), Roth (Karlsruhe), Serva (Heidelberg) and Sigma (Deisenhofen). Solutions were prepared using purified, pyrogen-free water, designated as H₂O in the following text, from a Milli-Q water purification system (Millipore, Eschborn). Restriction endonucleases, DNA-modifying enzymes and molecular biology kits were obtained from the companies AGS (Heidelberg), Amersham (Braunschweig), Biometra (Göttingen), Boehringer (Mannheim), Genomed (Bad Oeynhausen), New England Biolabs (Schwalbach/Taunus), Novagen (Madison, Wis., USA), Perkin-Elmer (Weiterstadt), Pharmacia (Freiburg), Qiagen (Hilden) and Stratagene (Amsterdam, Netherlands). They were used, if not mentioned otherwise, according to the manufacturer's instructions.

c) Plant Material:

For this study, in one series of experiments, root material of wild-type and pickle mutant plants of Arabidopsis thaliana was used. The pkl mutant was isolated from an ethyl methanesulfonate-mutagenized population of the Columbia ecotype as described (Ogas et al. 1997, Science 277:91-94; Ogas et al. 1999, Proc. Natl. Acad. Sci. USA 96:13839-13844). In other series of experiments, siliques of individual ecotypes of Arabidopsis thaliana and of selected Arabidopsis phytohormone mutants were used. Seeds were obtained from the Arabidopsis stock center.

d) Plant Growth:

Plants were either grown on Murashige-Skoog medium as described in Ogas et al. (1997, Science 277:91-94; 1999, Proc. Natl. Acad. Sci. USA 96:13839-13844) or on soil under standard conditions as described in Focks & Benning (1998, Plant Physiol. 118:91-101).

Example 2 Total DNA Isolation from Plants

The details for the isolation of total DNA relate to the working up of one gram fresh weight of plant material.

CTAB buffer: 2% (w/v) N-cetyl-N,N,N-trimethylammonium bromide (CTAB); 100 mM Tris HCl pH 8.0; 1.4 M NaCl; 20 mM EDTA.

N-Laurylsarcosine buffer: 10% (w/v) N-laurylsarcosine; 100 mM Tris HCl pH 8.0; 20 mM EDTA.

The plant material was triturated under liquid nitrogen in a mortar to give a fine powder and transferred to 2 ml Eppendorf vessels. The frozen plant material was then covered with a layer of 1 ml of decomposition buffer (1 ml CTAB buffer, 100 μl of N-laurylsarcosine buffer, 20 μl of β-mercaptoethanol and 10 μl of proteinase K solution, 10 mg/ml) and incubated at 60° C. for one hour with continuous shaking. The homogenate obtained was distributed into two Eppendorf vessels (2 ml) and extracted twice by shaking with the same volume of chloroform/isoamyl alcohol (24:1). For phase separation, centrifugation was carried out at 8000 g and RT for 15 min in each case. The DNA was then precipitated at −70° C. for 30 min using ice-cold isopropanol. The precipitated DNA was sedimented at 4° C. and 10,000 g for 30 min and resuspended in 180 μl of TE buffer (Sambrook et al. 1989, Cold Spring Harbor Laboratory Press: ISBN 0-87969-309-6). For further purification, the DNA was treated with NaCl (1.2 M final concentration) and precipitated again at −70° C. for 30 min using twice the volume of absolute ethanol. After a washing step with 70% ethanol, the DNA was dried and subsequently taken up in 50 μl of H₂O+RNAse (50 mg/ml final concentration). The DNA was dissolved overnight at 4° C. and the RNAse digestion was subsequently carried out at 37° C. for 1 h. Storage of the DNA took place at 4° C.

Example 3 Isolation of Total RNA and poly-(A)+ RNA from Plants

For the investigation of transcripts, both total RNA and poly-(A)⁺ RNA were isolated. RNA was isolated from siliques of Arabidopsis plants according to the following procedure:

RNA Preparation from Arabidopsis Seeds—“Hot” Extraction:

1. Buffers, Enzymes and Solution

-   2 M KCl
-   Proteinase K
-   Phenol (for RNA)
-   Chloroform:isoamylalcohol
-   (Phenol:chloroform 1:1; pH adjusted for RNA)
-   4 M LiCl, DEPC-treated
-   DEPC-treated water
-   3 M NaOAc, pH 5, DEPC-treated
-   Isopropanol
-   70% ethanol (made up with DEPC-treated water)
-   Resuspension buffer: 0.5% SDS, 10 mM Tris pH 7.5, 1 mM EDTA made up with DEPC-treated water as this solution cannot be DEPC-treated
-   Extraction buffer:
    -   0.2 M Na borate
    -   30 mM EDTA
    -   30 mM EGTA
    -   1% SDS (250 μl of 10% SDS solution for 2.5 ml buffer)
    -   1% deoxycholate (25 mg for 2.5 ml buffer)
    -   2% PVPP (insoluble; 50 mg for 2.5 ml buffer)
    -   2% PVP 40K (50 mg for 2.5 ml buffer)
    -   10 mM DTT
    -   100 mM β-mercaptoethanol (fresh, handle under fume hood; use 35 μl of 14.3 M solution for 5 ml buffer)
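The amounts given in parentheses for the extraction buffer follow from two simple relations: dilution of a liquid stock (C1·V1 = C2·V2) and the definition of % (w/v) as grams per 100 ml. As a convenience sketch only (the function names are hypothetical and not part of the protocol), the same arithmetic can be used in Python to scale the recipe to other batch volumes:

```python
def stock_volume_ul(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of liquid stock in microlitres, from C1*V1 = C2*V2.

    final_conc and stock_conc must be in the same unit (e.g. both in M
    or both in %).
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc / stock_conc * final_volume_ml * 1000.0


def solid_mass_mg(percent_w_v: float, final_volume_ml: float) -> float:
    """Mass of solid in milligrams for a % (w/v) solution (1% w/v = 10 mg/ml)."""
    return percent_w_v * 10.0 * final_volume_ml


if __name__ == "__main__":
    volume_ml = 2.5  # batch size used for most entries in the list above
    print("10% SDS stock (µl):", stock_volume_ul(1.0, 10.0, volume_ml))
    print("deoxycholate (mg):", solid_mass_mg(1.0, volume_ml))
    print("PVPP or PVP 40K (mg):", solid_mass_mg(2.0, volume_ml))
    # The list states the beta-mercaptoethanol amount for a 5 ml batch.
    print("14.3 M beta-mercaptoethanol (µl):", round(stock_volume_ul(0.1, 14.3, 5.0), 1))
```

Note that the list quotes most amounts for 2.5 ml of buffer but the β-mercaptoethanol amount for 5 ml, so the batch volume has to be chosen consistently when scaling.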

2. Extraction

Heat extraction buffer up to 80° C. Grind tissue in a liquid nitrogen-cooled mortar and transfer the tissue powder to a 1.5 ml tube. Tissue should be kept frozen until buffer is added, so transfer the sample with a pre-cooled spatula and keep the tube in liquid nitrogen at all times. Add 350 μl preheated extraction buffer (here for 100 mg tissue; buffer volume can be as much as 500 μl for bigger samples) to the tube, vortex and heat the tube to 80° C. for ˜1 min. Then keep on ice. Vortex sample, grind additionally with electric mortar.

3. Digestion

Add Proteinase K (0.15 mg/100 mg tissue), vortex and keep at 37° C. for one hour.

4. First Purification

Add 27 μl 2 M KCl. Chill on ice for 10 min. Centrifuge at 12,000 rpm for 10 minutes at room temperature. Transfer supernatant to a fresh, RNase-free tube and do one phenol extraction, followed by a chloroform:isoamylalcohol extraction. Add 1 vol. isopropanol to supernatant and chill on ice for 10 min. Pellet RNA by centrifugation (7000 rpm for 10 min at RT). Dissolve pellet in 1 ml 4 M LiCl by 10 to 15 min vortexing. Pellet RNA by 5 min centrifugation.

5. Second Purification

Resuspend pellet in 500 μl Resuspension buffer. Add 500 μl phenol and vortex. Add 250 μl chloroform:isoamylalcohol and vortex. Spin for 5 min and transfer supernatant to a fresh tube. Repeat chloroform:isoamylalcohol extraction until the interface is clear. Transfer supernatant to a fresh tube and add 1/10 vol 3 M NaOAc, pH 5 and 600 μl isopropanol. Keep at −20° C. for 20 min or longer. Pellet RNA by 10 min centrifugation. Wash pellet once with 70% ethanol. Remove all remaining alcohol before dissolving the pellet in 15 to 20 μl DEPC-water. Determine quantity and quality by measuring the absorbance of a 1:200 dilution at 260 and 280 nm. 40 μg RNA/ml = 1 OD₂₆₀
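The quantitation step above reduces to a short calculation: the undiluted concentration is the A₂₆₀ reading multiplied by the dilution factor (200) and by the conversion factor stated above (40 μg RNA/ml per OD₂₆₀ unit), and the A₂₆₀/A₂₈₀ ratio serves as the quality indicator. A minimal Python sketch, with hypothetical function names and made-up example readings:

```python
def rna_conc_ug_per_ml(a260: float, dilution_factor: float = 200.0,
                       ug_per_ml_per_od: float = 40.0) -> float:
    """RNA concentration of the undiluted sample in µg/ml.

    Defaults mirror the text: a 1:200 dilution and 40 µg RNA/ml per
    OD260 unit.
    """
    return a260 * dilution_factor * ug_per_ml_per_od


def a260_a280_ratio(a260: float, a280: float) -> float:
    """Absorbance ratio used as a rough purity indicator."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280


if __name__ == "__main__":
    # Example readings are invented for illustration only.
    conc = rna_conc_ug_per_ml(0.25)
    print(conc, "µg/ml, ratio", a260_a280_ratio(0.25, 0.125))
```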

RNA from roots of wild-type and the pickle mutant of Arabidopsis was isolated as described (Ogas et al. 1997, Science 277:91-94; Ogas et al. 1999, Proc. Natl. Acad. Sci. USA 96:13839-13844).

The mRNA was prepared from total RNA, using the Amersham Pharmacia Biotech mRNA purification kit, which utilizes oligo(dT)-cellulose columns.

Poly-(A)+ RNA was isolated using Dynabeads® (Dynal, Oslo, Norway) following the manufacturer's protocol. After determination of the concentration of the RNA or of the poly-(A)+ RNA, the RNA was precipitated by addition of 1/10 volume of 3 M sodium acetate pH 4.6 and 2 volumes of ethanol and stored at −70° C.

Example 4 cDNA Library Construction

For cDNA library construction, first strand synthesis was achieved using Murine Leukemia Virus reverse transcriptase (Roche, Mannheim, Germany) and oligo-d(T)-primers, second strand synthesis by incubation with DNA polymerase I, Klenow enzyme and RNAseH digestion at 12° C. (2 h), 16° C. (1 h) and 22° C. (1 h). The reaction was stopped by incubation at 65° C. (10 min) and subsequently transferred to ice. Double stranded DNA molecules were blunted by T4-DNA-polymerase (Roche, Mannheim) at 37° C. (30 min). Nucleotides were removed by phenol/chloroform extraction and Sephadex G50 spin columns. EcoRI adapters (Pharmacia, Freiburg, Germany) were ligated to the cDNA ends by T4-DNA-ligase (Roche, 12° C., overnight) and phosphorylated by incubation with polynucleotide kinase (Roche, 37° C., 30 min). This mixture was subjected to separation on a low melting agarose gel. DNA molecules larger than 300 base pairs were eluted from the gel, phenol extracted, concentrated on Elutip-D-columns (Schleicher and Schuell, Dassel, Germany) and were ligated to vector arms and packed into lambda ZAPII phages or lambda ZAP-Express phages using the Gigapack Gold Kit (Stratagene, Amsterdam, Netherlands) using material and following the instructions of the manufacturer.

Example 5 Identification of LMP Genes of Interest

The pickle Arabidopsis mutant was used to identify LMP-encoding genes. The pickle mutant accumulates seed storage compounds, such as seed storage lipids and seed storage proteins, in the root tips (Ogas et al. 1997, Science 277:91-94; Ogas et al. 1999, Proc. Natl. Acad. Sci. USA 96:13839-13844). mRNA isolated from roots of wild-type and pickle plants was used to create a subtracted and normalized cDNA library (SSH library, see example 4) containing cDNAs that are only present in the pickle roots, but not in the wild-type roots. Clones from the SSH library were spotted onto nylon membranes and hybridized with radio-labeled pickle or wild-type root mRNA to ascertain that the SSH clones were more abundant in pickle roots compared to wild-type roots. These SSH clones were randomly sequenced and the sequences were annotated using the annotation program PedantPro (see example 11). Based on the expression levels and on these initial functional annotations (see Table 3), clones from the SSH library were identified as potential LMP-encoding genes.

To identify additional potential gene targets from the Arabidopsis pickle mutant, the MPSS RNA expression profiling technology of Lynx Therapeutics Inc. was used (Brenner et al. 2000, “Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays”, Nature Biotechnology 18:630-634). The MPSS technology enables the quantitation of the abundance of mRNA transcripts in mRNA samples and was used to obtain expression profiles of wild-type and pickle root mRNAs. RNA was harvested from roots of 10 day old wild-type and pkl mutant seedlings that were grown on a defined medium on Petri plates. Candidate genes were selected based on the significant upregulation of their expression levels in pickle roots compared to wild-type roots. Since the pickle root exhibits various embryonic phenotypes such as the accumulation of seed storage lipids and proteins, the upregulation of genes in the pickle root implied that the same genes could play important roles in the regulation of seed development and the accumulation of seed storage compounds in developing seeds.
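The text does not state how "significant upregulation" was scored. Purely as an illustrative sketch, and assuming a simple fold-change filter on normalized transcript abundances, candidate selection could look like the following Python snippet. The two-fold threshold, the pseudocount, the function name and the gene labels are all assumptions made for the example, not values from the study.

```python
def upregulated(wild_type: dict[str, float], mutant: dict[str, float],
                min_fold: float = 2.0, pseudocount: float = 1.0) -> dict[str, float]:
    """Return genes whose mutant/wild-type abundance ratio is >= min_fold.

    Abundances are assumed to be normalized counts; the pseudocount avoids
    division by zero for transcripts not detected in wild-type roots.
    """
    hits = {}
    for gene, mutant_level in mutant.items():
        wt_level = wild_type.get(gene, 0.0)
        fold = (mutant_level + pseudocount) / (wt_level + pseudocount)
        if fold >= min_fold:
            hits[gene] = fold
    return hits


if __name__ == "__main__":
    # Invented abundances for three hypothetical genes.
    wt = {"geneA": 9.0, "geneB": 50.0}
    pkl = {"geneA": 99.0, "geneB": 55.0, "geneC": 3.0}
    print(upregulated(wt, pkl))
```

A real analysis would normally add a statistical test on replicate measurements rather than rely on a fold-change cutoff alone.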

TABLE 3 Putative LMP Functions

Sequence code | Function | ORF position | SEQ ID NO:
AT004002024 | Per1 gene; peroxiredoxin | 1-648 | 1
AT004004054 | Enoyl-CoA hydratase | 1-720 | 3
AT004005069 | similarity to phosphatidylcholine-sterol O-acyltransferase (EC 2.3.1.43) precursor | 1-1995 | 5
AT004009021 | Plastidic dihydroxyacetone 3-phosphate reductase | 1-1200 | 7
Pk109 | putative protein | 1-1173 | 9
Pk109-1 | Putative protein | 1-843 | 11
Pk110 | phosphoenolpyruvate carboxykinase (ATP)-like protein | 1-2013 | 13
Pk111 | hypothetical protein | 1-2337 | 15
Pk111-1 | hypothetical protein | 1-1667 | 17
Pk113 | putative isocitrate lyase | 1-1719 | 19
Pk114 | unknown protein | 1-894 | 21
Pk116 | acyl carrier protein 1 precursor (ACP) | 1-411 | 23
Pk117 | inorganic pyrophosphatase-like protein | 1-900 | 25
Pk118 | sucrose synthase | 1-2415 | 27
Pk118-1 | sucrose synthase | 1-2391 | 29
Pk120 | pyruvate kinase | 1-1530 | 31

TABLE 4 Grouping of LMPs based on Functional protein domains

Functional category | SEQ ID: | SEQ Code: | Functional domain | Domain position
Antioxidant | 1 | AT004002024 | AhpC-TSA family: Alkyl Hydroperoxide peroxidases AhpC, Thiol-specific antioxidant protein TSA | 6-152
Beta-oxidation | 3 | AT004004054 | Enoyl-CoA hydratase/isomerase | 13-181
Acyltransferase | 5 | AT004005069 | lecithin:cholesterol acyltransferase (LCAT) | 103-627
Dehydrogenase | 7 | AT004009021 | NAD-dependent glycerol-3-phosphate dehydrogenase | 54-396
Dehydrogenase | 25 | Pk117 | UDP-glucose/GDP-mannose dehydrogenase | 36-45
ATP synthase | 21 | Pk114 | ATP synthase (E/31 kDa) subunit | 74-95
Kinase | 13 | Pk110 | Phosphoenolpyruvate carboxykinase | 147-618
Kinase | 9 | Pk109 | Cytidylate kinase | 200-234
Kinase | 11 | Pk109-1 | Cytidylate kinase | 165-199
Kinase | 23 | Pk116 | Polyphosphate kinase | 76-126
Kinase | 31 | Pk120 | Pyruvate kinase | 16-365
Isocitrate lyase | 19 | Pk113 | Isocitrate lyase | 18-548
Sucrose synthase | 27 | Pk118 | Sucrose synthase | 11-551
Sucrose synthase | 29 | Pk118-1 | Sucrose synthase | 2-543
Membrane-associated | 15 | Pk111 | Ca-activated BK potassium channel α subunit | 315-330
Membrane-associated | 15 | Pk111 | Anion-transporting ATPase | 402-413
Membrane-associated | 17 | Pk111-1 | Ca-activated BK potassium channel α subunit | 93-108
Membrane-associated | 17 | Pk111-1 | Anion-transporting ATPase | 180-191

Classification of the proteins was done by Blasting against the BLOCKS database (S. Henikoff & J. G. Henikoff, “Protein family classification based on searching a database of blocks”, Genomics 19:97-107 (1994)).

Example 6 Cloning of Full-Length cDNAs and Binary Plasmids for Plant Transformation

RACE-PCR to Determine Full-Length Sequences

Full-length sequences of the Arabidopsis thaliana cDNAs that were identified in the SSH library and by MPSS RNA expression profiling were isolated by RACE PCR using the SMART RACE cDNA amplification kit from Clontech allowing both 5′- and 3′ rapid amplification of cDNA ends (RACE). The isolation of first-strand cDNAs and the RACE PCR protocol used were based on the manufacturer's conditions. The RACE product fragments were extracted from agarose gels with a QIAquick® Gel Extraction Kit (Qiagen) and ligated into the TOPO® pCR 2.1 vector (Invitrogen) following manufacturer's instructions. Recombinant vectors were transformed into TOP10 cells (Invitrogen) using standard conditions (Sambrook et al. 1989). Transformed cells were grown overnight at 37° C. on LB agar containing 50 μg/ml kanamycin and spread with 40 μl of a 40 mg/ml stock solution of X-gal in dimethylformamide for blue-white selection. Single white colonies were selected and used to inoculate 3 ml of liquid LB containing 50 μg/ml kanamycin and grown overnight at 37° C. Plasmid DNA was extracted using the QIAprep® Spin Miniprep Kit (Qiagen) following manufacturer's instructions. Subsequent analyses of clones and restriction mapping were performed according to standard molecular biology techniques (Sambrook et al. 1989). The sequences obtained from the RACE reactions were compiled to give the nucleotide sequences for the LMP genes (SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29 and 31).

RT-PCR and Cloning of Arabidopsis thaliana LMP Genes

Full-length LMP cDNAs were isolated by RT-PCR from Arabidopsis thaliana RNA. The synthesis of the first strand cDNA was achieved using AMV Reverse Transcriptase (Roche, Mannheim, Germany). The resulting single-stranded cDNA was amplified via Polymerase Chain Reaction (PCR) utilizing two gene-specific primers. The conditions for the reaction were standard conditions with Expand High Fidelity PCR system (Roche). The parameters for the reaction were: five minutes at 94° C. followed by five cycles of 40 seconds at 94° C., 40 seconds at 50° C. and 1.5 minutes at 72° C. This was followed by thirty cycles of 40 seconds at 94° C., 40 seconds at 65° C. and 1.5 minutes at 72° C. The fragments generated under these RT-PCR conditions were analyzed by agarose gel electrophoresis to make sure that PCR products of the expected length had been obtained.
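For reference, the cycling parameters stated above can be written down as a small data structure, which also makes it easy to total the programmed hold time. This is a bookkeeping sketch in Python only; the stage labels are descriptive names chosen here, and ramp times between temperatures are not included.

```python
# Each stage: (label, number of cycles, [(temperature in deg C, seconds), ...]).
# Temperatures and times are those given in the text; labels are assumed.
PROGRAM = [
    ("initial denaturation", 1, [(94, 300)]),
    ("cycles with 50 deg C annealing", 5, [(94, 40), (50, 40), (72, 90)]),
    ("cycles with 65 deg C annealing", 30, [(94, 40), (65, 40), (72, 90)]),
]


def total_seconds(program) -> int:
    """Sum of all programmed hold times, ignoring ramping between steps."""
    return sum(cycles * sum(seconds for _, seconds in steps)
               for _, cycles, steps in program)


if __name__ == "__main__":
    seconds = total_seconds(PROGRAM)
    print(f"{seconds} s ({seconds / 60:.1f} min) of programmed hold time")
```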

Full-length LMP cDNAs were isolated by using synthetic oligonucleotide primers (MWG-Biotech) designed based on the LMP gene specific DNA sequence that was determined by EST sequencing and by sequencing of RACE PCR products. For SEQ ID NO:1, SEQ ID NO:5, and SEQ ID NO:7, the PCR primers contained a NotI restriction site 5′ upstream of the ATG start codon and a NotI restriction site 3′ downstream of the stop codon. In the case of SEQ ID NO:3, the PCR primers contained a BamHI and an XbaI restriction site, respectively. All other 5′ PCR primers (“forward primers”, F) contained an AscI restriction site 5′ upstream of the ATG start codon. All 3′ PCR primers (“reverse primers”, R) contained a PacI restriction site 3′ downstream of the stop codon. The restriction sites were added so that the RT-PCR amplification products could be cloned into the AscI and PacI restriction sites located in the multiple cloning site of the binary vector pBPS-GB1. The first 2 nucleotides are used as spacers so the restriction enzymes cut properly. The following “forward” (F) and “reverse” (R) primers were used to amplify the full-length Arabidopsis thaliana cDNAs by RT-PCR using RNA from Arabidopsis thaliana as original template:

For amplification of SEQ ID NO: 1
AT004002024F (SEQ ID NO: 33) (5′-ATAAGAATGCGGCCGCATGCCAGGGATCACACTAG-3′)
AT004002024R (SEQ ID NO: 34) (5′-ATAAGAATGCGGCCGCTCAAGAGACCTCTGTGTGA-3′)

For amplification of SEQ ID NO: 3
AT004004054F (SEQ ID NO: 35) (5′-GCGGATCCAGAGAAATGTGTTCATTAGAG-3′)
AT004004054R (SEQ ID NO: 36) (5′-CGTCTAGAGTTCTAAAGTTTAGATCCAGT-3′)

For amplification of SEQ ID NO: 5
AT004005069F (SEQ ID NO: 37) (5′-ATAAGAATGCGGCCGCATGTCTCCACTTCTCCGGTTTAG-3′)
AT004005069R (SEQ ID NO: 38) (5′-ATAAGAATGCGGCCGCTCACAACTTGATGCTAATTC-3′)

For amplification of SEQ ID NO: 7
AT004009021F (SEQ ID NO: 39) (5′-ATAAGAATGCGGCCGCATGCGCTTCCGATCATTCTTCTTCTCCTCCTCTATC-3′)
AT004009021R (SEQ ID NO: 40) (3′-ATAAGAATGCGGCCGCTTATAGTTTGTTCTCGCGG-5′)

For amplification of SEQ ID NO: 9
Pk109F (SEQ ID NO: 41) (5′-ATGGCGCGCCATGTTGCCCAGATTAGCTCGAGTCG-3′)
pk109R (SEQ ID NO: 42) (5′-GCTTAATTAACTAACAGCTAGCACATTCCCTTGTG-3′)

For amplification of SEQ ID NO: 11
Pk109F (SEQ ID NO: 43) (5′-ATGGCGCGCCATGTTGCCCAGATTAGCTCGAGTCG-3′)
pk109R (SEQ ID NO: 44) (5′-GCTTAATTAACTAACAGCTAGCACATTCCCTTGTG-3′)

For amplification of SEQ ID NO: 13
Pk110F (SEQ ID NO: 45) (5′-ATGGCGCGCCATGTCGGCCGGTAACGGAAATGCTAC-3′)
pk110R (SEQ ID NO: 46) (5′-GCTTAATTAACTAAAAGATAGGACCAGCAGCGAG-3′)

For amplification of SEQ ID NO: 15
Pk111F (SEQ ID NO: 47) (5′-ATGGCGCGCCATGGTTTCGTTTACGGGTTTCGC-3′)
pk111R (SEQ ID NO: 48) (5′-GCTTAATTAATCAAGGTCCTCTCATCTTTTCAACA-3′)

For amplification of SEQ ID NO: 17
Pk111F (SEQ ID NO: 49) (5′-ATGGCGCGCCATGGTTTCGTTTACGGGTTTCGC-3′)
pk111R (SEQ ID NO: 50) (5′-GCTTAATTAATCAAGGTCCTCTCATCTTTTCAACA-3′)

For amplification of SEQ ID NO: 19
Pk113F (SEQ ID NO: 51) (5′-ATGGCGCGCCAAGACTAACATGGAAATTGATGGCCG-3′)
pk113R (SEQ ID NO: 52) (5′-GCTTAATTAACTTCTACCGGGTTTTTTCACTACG-3′)

For amplification of SEQ ID NO: 21
Pk114F (SEQ ID NO: 53) (5′-ATGGCGCGCCATGGGGTTAGAGAGGAAAGTGTACGG-3′)
pk114R (SEQ ID NO: 54) (5′-GCTTAATTAATCAGAGCTCAGCATCATCGTCGGT-3′)

For amplification of SEQ ID NO: 23
Pk116F (SEQ ID NO: 55) (5′-ATGGCGCGCCATGGCGACTCAATTCAGCGCTTC-3′)
pk116R (SEQ ID NO: 56) (5′-GCTTAATTAATTACTTCTTCTCGTTGATGAGCTCTTC-3′)

For amplification of SEQ ID NO: 25
Pk117F (SEQ ID NO: 57) (5′-ATGGCGCGCCATGGCGGCTACTAGAGTGTTAACTG-3′)
pk117R (SEQ ID NO: 58) (5′-GCTTAATTAATCAGTAAAGTGAAAGGTCTCCAGCA-3′)

For amplification of SEQ ID NO: 27
Pk118F (SEQ ID NO: 59) (5′-ATGGCGCGCCAACAATGGCGTCTTTCTTTGATCTCG-3′)
pk118R (SEQ ID NO: 60) (5′-GCTTAATTAATCAGTTCTCATCTGTTGCCAG-3′)

For amplification of SEQ ID NO: 29
Pk118F (SEQ ID NO: 61) (5′-ATGGCGCGCCAACAATGGCGTCTTTCTTTGATCTCG-3′)
pk118R (SEQ ID NO: 62) (5′-GCTTAATTAATCAGTTCTCATCTGTTGCCAG-3′)

For amplification of SEQ ID NO: 31
Pk120F (SEQ ID NO: 63) (5′-ATGGCGCGCCATGTCGAACATAGACATAGAAGGGATC-3′)
pk120R (SEQ ID NO: 64) (5′-GCTTAATTAATCACTTAACCACACAGATCTTAATAACTG-3′)
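The primer design described above (a short spacer, then a restriction site, then the gene-specific sequence) can be checked programmatically. The sketch below is illustrative only: it uses the standard recognition sequences of the named enzymes and a handful of primers copied from the list above, and the helper names site_offset and check_primers are hypothetical.

```python
# Standard recognition sequences of the restriction enzymes named in the text.
SITES = {
    "NotI": "GCGGCCGC",
    "BamHI": "GGATCC",
    "XbaI": "TCTAGA",
    "AscI": "GGCGCGCC",
    "PacI": "TTAATTAA",
}

# A sample of primers from the list above: name -> (sequence, expected enzyme).
PRIMERS = {
    "AT004002024F": ("ATAAGAATGCGGCCGCATGCCAGGGATCACACTAG", "NotI"),
    "AT004004054F": ("GCGGATCCAGAGAAATGTGTTCATTAGAG", "BamHI"),
    "AT004004054R": ("CGTCTAGAGTTCTAAAGTTTAGATCCAGT", "XbaI"),
    "Pk110F": ("ATGGCGCGCCATGTCGGCCGGTAACGGAAATGCTAC", "AscI"),
    "pk110R": ("GCTTAATTAACTAAAAGATAGGACCAGCAGCGAG", "PacI"),
}


def site_offset(sequence, enzyme):
    """Return the 0-based position of the enzyme's site, or -1 if absent."""
    return sequence.upper().find(SITES[enzyme])


def check_primers(primers):
    """Map each primer name to the offset of its expected restriction site."""
    return {name: site_offset(seq, enz) for name, (seq, enz) in primers.items()}


if __name__ == "__main__":
    for name, offset in check_primers(PRIMERS).items():
        print(name, "site found at offset", offset)
```

For the AscI, PacI, BamHI and XbaI primers sampled here, the site begins at offset 2, which is consistent with the two spacer nucleotides mentioned in the text; the NotI primers carry a longer 5′ extension.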

Example 7 Agrobacterium Mediated Plant Transformation

For plant transformation, binary vectors such as pBinAR can be used (Höfgen & Willmitzer 1990, Plant Sci. 66: 221-230). Plant binary vectors encoding LMP genes were constructed with the aim of achieving overexpression of functionally active proteins in transgenic plants. All LMP gene candidates were cloned into the plant binary vector pBPS-GB1. The binary vector contains a selectable marker gene under the control of the AtAct2-I promoter (An Y-Q et al., 1996, Plant Journal 10:107-121) and a USP (unknown seed protein, Bäumlein et al., Mol Gen Genet 225: 459-467, 1991) seed-specific promoter driving the candidate LMP gene with the NOSpA terminator. Full-length LMP cDNAs were cloned into the AscI and PacI restriction sites in the multiple cloning site of pBPS-GB1 in sense orientation behind the USP seed-specific promoter. The recombinant binary vectors (based on pBPS-GB1) containing the genes of interest were transformed into E. coli Top10 cells (Invitrogen) using standard conditions. Transformed cells were selected on LB agar containing an antibiotic and grown overnight at 37° C. Plasmid DNA was extracted using the QIAprep Spin Miniprep Kit (Qiagen) following the manufacturer's instructions. Analysis of subsequent clones and restriction mapping were performed according to standard molecular biology techniques (Sambrook et al. 1989, Molecular Cloning, A Laboratory Manual. 2nd Edition. Cold Spring Harbor Laboratory Press. Cold Spring Harbor, N.Y.). The nucleotide sequence of the inserted LMP genes was verified by “2+1” sequencing (the insert DNA was sequenced by determining the nucleotide sequence of one DNA strand with two independent sequence reactions and the complementary DNA strand with one sequencing reaction according to the Bermuda convention). The full-length sequences are shown as SEQ ID NOs: 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29 and 31.

Agrobacterium mediated plant transformation with binary vectors encoding the LMP nucleic acids described herein was performed using standard transformation and regeneration techniques (Gelvin, Stanton B. & Schilperoort R. A, Plant Molecular Biology Manual, 2nd ed. Kluwer Academic Publ., Dordrecht 1995 in Sect., Ringbuc Zentrale Signatur: BT11-P; Glick, Bernard R. and Thompson, John E. Methods in Plant Molecular Biology and Biotechnology, p. 360, CRC Press, Boca Raton 1993).

The Agrobacterium mediated transformation of Arabidopsis thaliana was performed using the GV3 (pMP90) (Koncz & Schell, 1986, Mol. Gen. Genet. 204: 383-396) Agrobacterium tumefaciens strain. Arabidopsis thaliana ecotype Col-2 was grown and transformed according to standard conditions (Bechtold 1993, Acad. Sci. Paris. 316: 1194-1199; Bent et al. 1994, Science 265: 1856-1860). Kanamycin was used as antibiotic selection marker for Agrobacterium transformation. The presence and correct orientation of the LMP-encoding binary vectors in Agrobacterium cultures was verified by PCR using the LMP gene-specific primers described in Example 6. For the plant transformation, flowering Arabidopsis plants were dipped into the recombinant Agrobacterium cultures and allowed to go to seed. Transgenic Arabidopsis T1 plants were identified by growing the seeds on Petri plates containing the selection agent appropriate for the selection marker present on the T-DNA. Surviving healthy seedlings were transferred to soil and grown in a growth chamber under controlled conditions. T2 seeds were harvested from these T1 plants. The transgenic lines were propagated through successive generations and T2, T3 and T4 seeds were obtained. The segregation ratio of the presence or absence of the T-DNA was monitored in order to determine whether the lines contained single-locus or multi-locus insertions and whether the lines were homozygous or heterozygous for the T-DNA insertion. T2, T3 and T4 seeds were analyzed for seed oil content (see also Example 8).
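The segregation analysis mentioned above is commonly evaluated with a chi-square goodness-of-fit test: progeny of a plant carrying a single hemizygous T-DNA locus are expected to segregate approximately 3:1 (resistant:sensitive) on selective medium. The text does not state which test was used, so the following is a sketch under that assumption; the function names and the seedling counts in the usage lines are hypothetical.

```python
# Critical chi-square value for 1 degree of freedom at p = 0.05.
CRITICAL_DF1_P05 = 3.841


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic for observed counts against a 3:1 expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def consistent_with_single_locus(resistant, sensitive):
    """True if the counts do not deviate significantly from 3:1 (p > 0.05)."""
    return chi_square_3_to_1(resistant, sensitive) < CRITICAL_DF1_P05


if __name__ == "__main__":
    # Hypothetical seedling counts, for illustration only.
    print(consistent_with_single_locus(75, 25))
    print(consistent_with_single_locus(95, 5))
```

A line whose counts deviate strongly toward more resistant seedlings than 3:1 would suggest a multi-locus insertion, while a line with no sensitive progeny at all would suggest a homozygous parent.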

Agrobacterium mediated plant transformation is also applicable to Brassica and other crops. In particular, seeds of canola are surface sterilized with 70% ethanol for 4 minutes at room temperature with continuous shaking, followed by 20% (v/v) Clorox supplemented with 0.05% (v/v) Tween for 20 minutes at room temperature with continuous shaking. Then, the seeds are rinsed 4 times with distilled water and placed on moistened sterile filter paper in a Petri dish at room temperature for 18 hours. The seed coats are removed and the seeds are air dried overnight in a half-open sterile Petri dish. During this period, the seeds lose approximately 85% of their water content. The seeds are then stored at room temperature in a sealed Petri dish until further use.

Agrobacterium tumefaciens culture is prepared from a single colony in LB solid medium plus appropriate antibiotics (e.g. 100 mg/l streptomycin, 50 mg/l kanamycin) followed by growth of the single colony in liquid LB medium to an optical density at 600 nm of 0.8. Then, the bacteria culture is pelleted at 7000 rpm for 7 minutes at room temperature, and re-suspended in MS (Murashige & Skoog 1962, Physiol. Plant. 15: 473-497) medium supplemented with 100 μM acetosyringone. Bacteria cultures are incubated in this pre-induction medium for 2 hours at room temperature before use. The axes of soybean zygotic seed embryos at approximately 44% moisture content are imbibed for 2 hours at room temperature with the pre-induced Agrobacterium suspension culture. (The imbibition of dry embryos with a culture of Agrobacterium is also applicable to maize embryo axes.) The embryos are removed from the imbibition culture and are transferred to Petri dishes containing solid MS medium supplemented with 2% sucrose and incubated for 2 days in the dark at room temperature. Alternatively, the embryos are placed on top of moistened (liquid MS medium) sterile filter paper in a Petri dish and incubated under the same conditions described above. After this period, the embryos are transferred to either solid or liquid MS medium supplemented with 500 mg/l carbenicillin or 300 mg/l cefotaxime to kill the agrobacteria. The liquid medium is used to moisten the sterile filter paper. The embryos are incubated for 4 weeks at 25° C., under 440 μmol m⁻² s⁻¹ and a 12-hour photoperiod. Once the seedlings have produced roots, they are transferred to sterile Metromix soil. The medium of the in vitro plants is washed off before transferring the plants to soil. The plants are kept under a plastic cover for 1 week to favor the acclimatization process. Then the plants are transferred to a growth room where they are incubated at 25° C., under 440 μmol m⁻² s⁻¹ light intensity and a 12 h photoperiod for about 80 days.

Samples of the primary transgenic plants (T₀) are analyzed by PCR to confirm the presence of T-DNA. These results are confirmed by Southern hybridization wherein DNA is electrophoresed on a 1% agarose gel and transferred to a positively charged nylon membrane (Roche Diagnostics). The PCR DIG Probe Synthesis Kit (Roche Diagnostics) is used to prepare a digoxigenin-labeled probe by PCR, and used as recommended by the manufacturer.

Transformation of soybean can be performed using, for example, a technique described in EP 424 047, U.S. Pat. No. 5,322,783 (Pioneer Hi-Bred International) or in EP 0397 687, U.S. Pat. No. 5,376,543 or U.S. Pat. No. 5,169,770 (University Toledo).

Example 8 Analysis of the Impact of Recombinant LMPs on the Production of a Desired Seed Storage Compound: Fatty Acid Production

The total fatty acid content of Arabidopsis seeds was determined by saponification of seeds in 0.5 M KOH in methanol at 80° C. for 2 h followed by LC-MS analysis of the free fatty acids. Total fatty acid content of seeds of control and transgenic plants was measured with bulked seeds (usually 5 mg seed weight) of a single plant. Three different types of controls have been used: Col-2 and Col-0 (Columbia-2 and Columbia-0, the Arabidopsis ecotypes the LMP genes of interest have been transformed in), C-24 (an Arabidopsis ecotype found to accumulate high amounts of total fatty acids in seeds) and GB1 (BPS empty, without LMP gene of interest, binary vector construct). The controls indicated in the tables below have been grown side by side with the transgenic lines. Differences in the total values of the controls are explained either by differences in the growth conditions, which were found to be very sensitive to small variations in the plant cultivation, or by differences in the standards added to quantify the fatty acid content. Because of the seed bulking, all values obtained with T2 seeds and in part also with T3 seeds are the result of a mixture of homozygous (for the gene of interest) and heterozygous events, implying that these data underestimate the LMP gene effect.

TABLE 5
Determination of the T2 seed total fatty acid content of transgenic lines of AT004002024 (containing SEQ ID NO: 1). Shown are the means (± standard deviation); number of individual measurements per plant line: 12-18. Col-0 is the Arabidopsis ecotype the LMP gene has been transformed in.

Genotype                       g total fatty acids/g seed weight
Pks002-1 transgenic seeds      0.321 ± 0.009
Pks002-7 transgenic seeds      0.330 ± 0.005
Pks002-8 transgenic seeds      0.300 ± 0.029
Pks002-10 transgenic seeds     0.363 ± 0.013
Pks002-16 transgenic seeds     0.278 ± 0.013
Col-0 wild-type seeds          0.247 ± 0.008
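To make the comparison with the control explicit, the means reported in Table 5 can be expressed as a percent change relative to the Col-0 wild type. This is a simple illustrative calculation on the published means, not a statistical test performed in the original work; the names TABLE_5, COL0_MEAN and percent_change are hypothetical.

```python
# Mean and standard deviation (g total fatty acids / g seed weight), Table 5.
TABLE_5 = {
    "Pks002-1": (0.321, 0.009),
    "Pks002-7": (0.330, 0.005),
    "Pks002-8": (0.300, 0.029),
    "Pks002-10": (0.363, 0.013),
    "Pks002-16": (0.278, 0.013),
}
COL0_MEAN = 0.247  # Col-0 wild-type control from the same table


def percent_change(line_mean, control_mean):
    """Relative change of a transgenic line versus the control, in percent."""
    return 100.0 * (line_mean - control_mean) / control_mean


changes = {
    line: round(percent_change(mean, COL0_MEAN), 1)
    for line, (mean, _sd) in TABLE_5.items()
}

if __name__ == "__main__":
    for line, change in changes.items():
        print(f"{line}: {change:+.1f}% vs. Col-0")
```

The same calculation applies to Tables 6 through 15 when the control grown alongside the respective transgenic lines is substituted.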

TABLE 6
Determination of the T3 seed total fatty acid content of transgenic lines of AT004004054 (containing SEQ ID NO: 3). Shown are the means (± standard deviation) of individual plants (numbers in parentheses).

Genotype                                    g total fatty acids/g seed weight
Pks001-22 (2-7) transgenic seeds            0.379 ± 0.038
Pks001-27 (2, 5, 8-9) transgenic seeds      0.371 ± 0.053
Pks001-39 (1, 2, 4-7) transgenic seeds      0.333 ± 0.038
Pks001-89 (1-7) transgenic seeds            0.295 ± 0.036
Pks001-106 (1-3, 5-7) transgenic seeds      0.261 ± 0.040
Col-0 wild-type seeds (1-8)                 0.284 ± 0.032

TABLE 7
Determination of the T2 seed total fatty acid content of transgenic lines of AT004005069 (containing SEQ ID NO: 5 in antisense orientation). Shown are the means (± standard deviation) of 18 individual plants, respectively.

Genotype                       g total fatty acids/g seed weight
Pks004-1 transgenic seeds      0.532 ± 0.014
Pks004-3 transgenic seeds      0.488 ± 0.013
Pks004-4 transgenic seeds      0.492 ± 0.016
Pks004-18 transgenic seeds     0.488 ± 0.012
Pks004-20 transgenic seeds     0.461 ± 0.011
Pks004-21 transgenic seeds     0.421 ± 0.035
Col-0 wild-type seeds          0.301 ± 0.026

TABLE 8
Determination of the T2 seed total fatty acid content of transgenic lines of AT004009021 (containing SEQ ID NO: 7). Shown are the means (± standard deviation) of 10 individual plants.

Genotype                       g total fatty acids/g seed weight
Pks003-3 transgenic seeds      0.353 ± 0.019
Pks003-4 transgenic seeds      0.293 ± 0.046
Pks003-8 transgenic seeds      0.265 ± 0.073
Pks003-16 transgenic seeds     0.312 ± 0.021
Pks003-17 transgenic seeds     0.311 ± 0.026
Pks003-18 transgenic seeds     0.322 ± 0.023
Col-0 wild-type seeds          0.259 ± 0.048

TABLE 9
Determination of the T3 seed total fatty acid content of transgenic lines of pk109 (containing SEQ ID NO: 11). Shown are the means (± standard deviation) of 14-20 individual plants per line.

Genotype                       g total fatty acids/g seed weight
C-24 wild-type seeds           0.393 ± 0.047
Col-2 wild-type seeds          0.351 ± 0.024
GB-1 empty vector control      0.350 ± 0.027
Pk109-11 transgenic seeds      0.377 ± 0.038
Pk109-16 transgenic seeds      0.384 ± 0.035
Pk109-12 transgenic seeds      0.384 ± 0.046

TABLE 10
Determination of the T2 seed total fatty acid content of transgenic lines of pk110 (containing SEQ ID NO: 13). Shown are the means (± standard deviation) of 6-10 individual plants per line.

Genotype                                         g total fatty acids/g seed weight
C-24 wild-type seeds                             0.405 ± 0.029
Col-2 wild-type seeds                            0.390 ± 0.019
Pk110 (7, 9, 10, 11, 13, 19) transgenic seeds    0.409 ± 0.016

TABLE 11
Determination of the T2 seed total fatty acid content of transgenic lines of pk111-1 (containing SEQ ID NO: 17). Shown are the means (± standard deviation) of 8-10 individual plants per line.

Genotype                                                  g total fatty acids/g seed weight
C-24 wild-type seeds                                      0.405 ± 0.029
Col-2 wild-type seeds                                     0.390 ± 0.019
Pk111 (1, 2, 4, 5, 6, 8, 14, 17, 19) transgenic seeds     0.417 ± 0.015

TABLE 12
Determination of the T3 seed total fatty acid content of transgenic lines of pk114 (containing SEQ ID NO: 21). Shown are the means (± standard deviation) of 12-19 individual plants per line.

Genotype                       g total fatty acids/g seed weight
C-24 wild-type seeds           0.435 ± 0.027
Col-2 wild-type seeds          0.398 ± 0.021
GB-1 empty vector control      0.412 ± 0.027
Pk114-16 transgenic seeds      0.430 ± 0.036
Pk114-19 transgenic seeds      0.419 ± 0.034
Pk114-19 transgenic seeds      0.438 ± 0.028

TABLE 13
Determination of the T2 seed total fatty acid content of transgenic lines of pk116 (containing SEQ ID NO: 23). Shown are the means (± standard deviation) of 6-10 individual plants per line.

Genotype                                         g total fatty acids/g seed weight
C-24 wild-type seeds                             0.405 ± 0.029
Col-2 wild-type seeds                            0.390 ± 0.019
Pk116 (2, 4, 5, 11, 13, 16) transgenic seeds     0.409 ± 0.013

TABLE 14
Determination of the T3 seed total fatty acid content of transgenic lines of pk117-1 (containing SEQ ID NO: 25). Shown are the means (± standard deviation) of bulked seeds (5 mg) of 16-19 individual plants per line.

Genotype                       g total fatty acids/g seed weight
C-24 wild-type seeds           0.442 ± 0.022
Col-2 wild-type seeds          0.407 ± 0.028
GB-1 empty vector control      0.403 ± 0.023
Pk117-10 transgenic seeds      0.421 ± 0.011
Pk117-3 transgenic seeds       0.424 ± 0.031

TABLE 15
Determination of the T2 seed total fatty acid content of transgenic lines of pk120 (containing SEQ ID NO: 31). Shown are the means (± standard deviation) of 6-12 individual plants per line.

Genotype                                         g total fatty acids/g seed weight
C-24 wild-type seeds                             0.408 ± 0.026
Col-2 wild-type seeds                            0.363 ± 0.023
Pk120 (1, 5, 6, 10, 11, 16) transgenic seeds     0.397 ± 0.010

Example 9 Analysis of the Impact of Recombinant Proteins on the Production of a Desired Seed Storage Compound

The effect of the genetic modification in plants on a desired seed storage compound (such as a sugar, lipid or fatty acid) can be assessed by growing the modified plant under suitable conditions and analyzing the seeds or any other plant organ for increased production of the desired product (i.e., a lipid or a fatty acid). Such analysis techniques are well known to one skilled in the art, and include spectroscopy, thin layer chromatography, staining methods of various kinds, enzymatic and microbiological methods, and analytical chromatography such as high performance liquid chromatography (see, for example, Ullmann 1985, Encyclopedia of Industrial Chemistry, vol. A2, pp. 89-90 and 443-613, VCH: Weinheim; Fallon, A. et al. 1987, Applications of HPLC in Biochemistry in: Laboratory Techniques in Biochemistry and Molecular Biology, vol. 17; Rehm et al., 1993 Product recovery and purification, Biotechnology, vol. 3, Chapter III, pp. 469-714, VCH: Weinheim; Belter, P. A. et al., 1988 Bioseparations: downstream processing for biotechnology, John Wiley & Sons; Kennedy J. F. & Cabral J. M. S. 1992, Recovery processes for biological materials, John Wiley and Sons; Shaeiwitz J. A. & Henry J. D. 1988, Biochemical separations in: Ullmann's Encyclopedia of Industrial Chemistry, Separation and purification techniques in biotechnology, vol. B3, Chapter 11, pp. 1-27, VCH: Weinheim; and Dechow F. J. 1989).

Besides the above-mentioned methods, plant lipids are extracted from plant material as described by Cahoon et al. (1999, Proc. Natl. Acad. Sci. USA 96, 22:12935-12940) and Browse et al. (1986, Anal. Biochemistry 442:141-145). Qualitative and quantitative lipid or fatty acid analysis is described in Christie, William W., Advances in Lipid Methodology. Ayr/Scotland: Oily Press (Oily Press Lipid Library); Christie, William W., Gas Chromatography and Lipids. A Practical Guide, Ayr, Scotland: Oily Press, 1989, Repr. 1992, IX, 307 pp. (Oily Press Lipid Library); and Progress in Lipid Research, Oxford: Pergamon Press, 1 (1952)-16 (1977), Progress in the Chemistry of Fats and Other Lipids CODEN.

Unequivocal proof of the presence of fatty acid products can be obtained by the analysis of transgenic plants following standard analytical procedures: GC, GC-MS or TLC as variously described by Christie and references therein (1997 in: Advances on Lipid Methodology 4th ed.: Christie, Oily Press, Dundee, pp. 119-169; 1998). Detailed methods are described for leaves by Lemieux et al. (1990, Theor. Appl. Genet. 80:234-240) and for seeds by Focks & Benning (1998, Plant Physiol. 118:91-101).

Positional analysis of the fatty acid composition at the C-1, C-2 or C-3 positions of the glycerol backbone is determined by lipase digestion (see, e.g., Siebertz & Heinz 1977, Z. Naturforsch. 32c: 193-205, and Christie 1987, Lipid Analysis 2nd Edition, Pergamon Press, Exeter, ISBN 0-08-023791-6).

A typical way to gather information regarding the influence of increased or decreased protein activities on lipid and sugar biosynthetic pathways is, for example, to analyze the carbon fluxes by labeling studies with leaves or seeds using ¹⁴C-acetate or ¹⁴C-pyruvate (see, e.g. Focks & Benning 1998, Plant Physiol. 118:91-101, Eccleston & Ohlrogge 1998, Plant Cell 10:613-621). The distribution of carbon-14 into lipids and aqueous soluble components can be determined by liquid scintillation counting after the respective separation (for example on TLC plates) including standards like ¹⁴C-sucrose and ¹⁴C-malate (Eccleston & Ohlrogge 1998, Plant Cell 10:613-621).

Material to be analyzed can be disintegrated via sonication, glass milling, liquid nitrogen and grinding or via other applicable methods. The material has to be centrifuged after disintegration. The sediment is re-suspended in distilled water, heated for 10 minutes at 100° C., cooled on ice and centrifuged again, followed by extraction in 0.5 M sulfuric acid in methanol containing 2% dimethoxypropane for 1 hour at 90° C., which hydrolyzes the oil and lipid compounds and yields transmethylated lipids. These fatty acid methyl esters are extracted in petroleum ether and finally subjected to GC analysis using a capillary column (Chrompack, WCOT Fused Silica, CP-Wax-52 CB, 25 m, 0.32 mm) at a temperature gradient between 170° C. and 240° C. for 20 minutes and 5 min. at 240° C. The identity of the resulting fatty acid methyl esters is defined by the use of standards available from commercial sources (i.e., Sigma).

In the case of fatty acids for which standards are not available, molecule identity is shown via derivatization and subsequent GC-MS analysis. For example, the localization of triple bond fatty acids is shown via GC-MS after derivatization via 4,4-dimethoxy-oxazoline derivatives (Christie, Oily Press, Dundee, 1998).

A common standard method for analyzing sugars, especially starch, is published by Stitt M., Lilley R. Mc. C., Gerhardt R. and Heldt M. W. (1989, “Determination of metabolite levels in specific cells and subcellular compartments of plant leaves” Methods Enzymol. 174:518-552; for other methods see also Härtel et al. 1998, Plant Physiol. Biochem. 36:407-417 and Focks & Benning 1998, Plant Physiol. 118:91-101).

For the extraction of soluble sugars and starch, 50 seeds are homogenized in 500 μl of 80% (v/v) ethanol in a 1.5-ml polypropylene test tube and incubated at 70° C. for 90 min. Following centrifugation at 16,000 g for 5 min, the supernatant is transferred to a new test tube. The pellet is extracted twice with 500 μl of 80% ethanol. The solvent of the combined supernatants is evaporated at room temperature under a vacuum. The residue is dissolved in 50 μl of water, representing the soluble carbohydrate fraction. The pellet left from the ethanol extraction, which contains the insoluble carbohydrates including starch, is homogenized in 200 μl of 0.2 N KOH, and the suspension is incubated at 95° C. for 1 h to dissolve the starch. Following the addition of 35 μl of 1 N acetic acid and centrifugation for 5 min at 16,000 g, the supernatant is used for starch quantification.

To quantify soluble sugars, 10 μl of the sugar extract is added to 990 μl of reaction buffer containing 100 mM imidazole, pH 6.9, 5 mM MgCl₂, 2 mM NADP, 1 mM ATP, and 2 units ml⁻¹ of glucose-6-P-dehydrogenase. For enzymatic determination of glucose, fructose and sucrose, 4.5 units of hexokinase, 1 unit of phosphoglucoisomerase, and 2 μl of a saturated fructosidase solution are added in succession. The production of NADPH is photometrically monitored at a wavelength of 340 nm. Similarly, starch is assayed in 30 μl of the insoluble carbohydrate fraction with a kit from Boehringer Mannheim.
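The absorbance change at 340 nm can be converted into an amount of hexose with the Beer-Lambert law, since one NADPH is formed per glucose-6-phosphate oxidized. The sketch below is an illustration under stated assumptions: the extinction coefficient of NADPH (6.22 mM⁻¹ cm⁻¹) and the 1 cm path length are commonly used values that are not given in the text, the small volume of the added enzymes is neglected, and the function names are hypothetical. The 10 μl aliquot, the 50 μl extract volume and the 50 seeds are taken from the extraction procedure described above.

```python
# Assumption: commonly cited millimolar extinction coefficient of NADPH at 340 nm.
EPSILON_NADPH_340 = 6.22  # mM^-1 cm^-1 (not stated in the original text)


def nadph_nmol(delta_a340, assay_volume_ml=1.0, path_cm=1.0):
    """NADPH formed in the cuvette (nmol), from the absorbance change at 340 nm."""
    conc_mM = delta_a340 / (EPSILON_NADPH_340 * path_cm)  # Beer-Lambert law
    return conc_mM * assay_volume_ml * 1000.0  # mM x ml = umol; x 1000 = nmol


def hexose_nmol_per_seed(delta_a340, aliquot_ul=10.0, extract_ul=50.0, seeds=50):
    """Scale the cuvette amount to the whole extract and divide by seed number."""
    return nadph_nmol(delta_a340) * (extract_ul / aliquot_ul) / seeds


if __name__ == "__main__":
    # Hypothetical absorbance change, for illustration only.
    print(round(hexose_nmol_per_seed(0.311), 2), "nmol hexose per seed")
```

Each enzyme added in succession yields its own absorbance increment, so the function would be applied to the change recorded after each addition to obtain glucose, fructose and sucrose separately.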

An example for analyzing the protein content in leaves and seeds can be found in Bradford M. M. (1976, “A rapid and sensitive method for the quantification of microgram quantities of protein using the principle of protein dye binding” Anal. Biochem. 72:248-254). For quantification of total seed protein, 15-20 seeds are homogenized in 250 μl of acetone in a 1.5-ml polypropylene test tube. Following centrifugation at 16,000 g, the supernatant is discarded and the vacuum-dried pellet is resuspended in 250 μl of extraction buffer containing 50 mM Tris-HCl, pH 8.0, 250 mM NaCl, 1 mM EDTA, and 1% (w/v) SDS. Following incubation for 2 h at 25° C., the homogenate is centrifuged at 16,000 g for 5 min and 200 μl of the supernatant is used for protein measurements. In the assay, γ-globulin is used for calibration. For protein measurements, the Lowry DC protein assay (Bio-Rad) or the Bradford assay (Bio-Rad) is used.

Enzymatic assays of hexokinase and fructokinase are performed spectrophotometrically according to Renz et al. (1993, Planta 190:156-165); assays of phosphoglucoisomerase, ATP-dependent 6-phosphofructokinase, pyrophosphate-dependent 6-phosphofructokinase, fructose-1,6-bisphosphate aldolase, triose phosphate isomerase, glyceraldehyde-3-P dehydrogenase, phosphoglycerate kinase, phosphoglycerate mutase, enolase and pyruvate kinase are performed according to Burrell et al. (1994, Planta 194:95-101); and assays of UDP-glucose pyrophosphorylase are performed according to Zrenner et al. (1995, Plant J. 7:97-107).

Intermediates of the carbohydrate metabolism, like glucose-1-phosphate, glucose-6-phosphate, fructose-6-phosphate, phosphoenolpyruvate, pyruvate, and ATP are measured as described in Härtel et al. (1998, Plant Physiol. Biochem. 36:407-417) and metabolites are measured as described in Jelitto et al. (1992, Planta 188:238-244).

In addition to the measurement of the final seed storage compound (i.e., lipid, starch or storage protein) it is also possible to analyze other components of the metabolic pathways utilized for the production of a desired seed storage compound, such as intermediates and side-products, to determine the overall efficiency of production of the compound (Fiehn et al. 2000, Nature Biotech. 18:1447-1161).

Example 10 Northern-Hybridization

For RNA hybridization, 20 μg of total RNA or 1 μg of poly-(A)+ RNA is separated by gel electrophoresis in 1.25% strength agarose gels using formaldehyde as described in Amasino (1986, Anal. Biochem. 152:304), transferred by capillary attraction using 10×SSC to positively charged nylon membranes (Hybond N+, Amersham, Braunschweig), immobilized by UV light and pre-hybridized for 3 hours at 68° C. using hybridization buffer (10% dextran sulfate w/v, 1 M NaCl, 1% SDS, 100 μg/ml of herring sperm DNA). The labeling of the DNA probe with the Highprime DNA labeling kit (Roche, Mannheim, Germany) is carried out during the pre-hybridization using alpha-³²P dCTP (Amersham, Braunschweig, Germany). Hybridization is carried out after addition of the labeled DNA probe in the same buffer at 68° C. overnight. The washing steps are carried out twice for 15 min using 2×SSC and twice for 30 min using 1×SSC, 1% SDS at 68° C. The exposure of the sealed filters is carried out at −70° C. for a period of 1 day to 14 days.

Example 11 DNA Sequencing and Computational Functional Analysis of SSH Library

The SSH cDNA library as described in Examples 4 and 5 was used for DNA sequencing according to standard methods, in particular by the chain termination method using the ABI PRISM Big Dye Terminator Cycle Sequencing Ready Reaction Kit (Perkin-Elmer, Weiterstadt, Germany). Random sequencing was carried out subsequent to preparative plasmid recovery from cDNA libraries via in vivo mass excision, retransformation, and subsequent plating of DH10B on agar plates (material and protocol details from Stratagene, Amsterdam, Netherlands). Plasmid DNA was prepared from overnight E. coli cultures grown in Luria-Broth medium containing ampicillin (see Sambrook et al. 1989, Cold Spring Harbor Laboratory Press: ISBN 0-87969-309-6) on a Qiagen DNA preparation robot (Qiagen, Hilden) according to the manufacturer's protocols. Sequencing primers with the following nucleotide sequences were used:

5′-CAGGAAACAGCTATGACC-3′ (SEQ ID NO: 65)
5′-CTAAAGGGAACAAAAGCTG-3′ (SEQ ID NO: 66)
5′-TGTTAAAACGACGGCCAGT-3′ (SEQ ID NO: 67)

Sequences were processed and annotated using the software package EST-MAX commercially provided by Bio-Max (Munich, Germany). The program incorporates practically all bioinformatics methods important for functional and structural characterization of protein sequences. For reference see the pedant.mips.biochem.mpg.de web page.

The most important algorithms incorporated in EST-MAX are:

FASTA: Very sensitive protein sequence database searches with estimates of statistical significance (Pearson W. R. 1990, Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 183:63-98).

BLAST: Very sensitive protein sequence database searches with estimates of statistical significance (Altschul S. F., Gish W., Miller W., Myers E. W. and Lipman D. J. Basic local alignment search tool. J. Mol. Biol. 215:403-410).

PREDATOR: High-accuracy secondary structure prediction from single and multiple sequences (Frishman & Argos 1997, 75% accuracy in protein secondary structure prediction. Proteins 27:329-335).

CLUSTALW: Multiple sequence alignment (Thompson, J. D., Higgins, D. G. and Gibson, T. J. 1994, CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice, Nucleic Acids Res. 22:4673-4680).

TMAP: Transmembrane region prediction from multiply aligned sequences (Persson B. & Argos P. 1994, Prediction of transmembrane segments in proteins utilizing multiple sequence alignments, J. Mol. Biol. 237:182-192).

ALOM2: Transmembrane region prediction from single sequences (Klein P., Kanehisa M., and DeLisi C. 1984, Prediction of protein function from sequence properties: A discriminant analysis of a database. Biochim. Biophys. Acta 787:221-226. Version 2 by Dr. K. Nakai).

PROSEARCH: Detection of PROSITE protein sequence patterns (Kolakowski L. F. Jr., Leunissen J. A. M. and Smith J. E. 1992, ProSearch: fast searching of protein sequences with regular expression patterns related to protein structure and function. Biotechniques 13:919-921).

BLIMPS: Similarity searches against a database of ungapped blocks (Wallace & Henikoff 1992, PATMAT: A searching and extraction program for sequence, pattern and block queries and databases, CABIOS 8:249-254. Written by Bill Alford).

Example 12 In vivo Mutagenesis

In vivo mutagenesis of microorganisms can be performed by incorporation and passage of the plasmid (or other vector) DNA through E. coli or other microorganisms (e.g. Bacillus spp. or yeasts such as Saccharomyces cerevisiae) which are impaired in their capabilities to maintain the integrity of their genetic information. Typical mutator strains have mutations in the genes for the DNA repair system (e.g., mutHLS, mutD, mutT, etc.; for reference, see Rupp W. D. 1996, DNA repair mechanisms, in: Escherichia coli and Salmonella, p. 2277-2294, ASM: Washington). Such strains are well known to those skilled in the art. The use of such strains is illustrated, for example, in Greener and Callahan 1994, Strategies 7:32-34. Transfer of mutated DNA molecules into plants is preferably done after selection and testing in microorganisms. Transgenic plants are generated according to various examples within the exemplification of this document.

Example 13 Assessment of the mRNA Expression and Activity of a Recombinant Gene Product in the Transformed Organism

The activity of a recombinant gene product in the transformed host organism can be measured on the transcriptional and/or on the translational level. A useful method to ascertain the level of transcription of the gene (an indicator of the amount of mRNA available for translation to the gene product) is to perform a Northern blot (for reference see, for example, Ausubel et al. 1988, Current Protocols in Molecular Biology, Wiley: New York), in which a primer designed to bind to the gene of interest is labeled with a detectable tag (usually radioactive or chemiluminescent), such that when the total RNA of a culture of the organism is extracted, run on a gel, transferred to a stable matrix and incubated with this probe, the binding and quantity of binding of the probe indicate the presence and also the quantity of mRNA for this gene. This information at least partially demonstrates the degree of transcription of the transformed gene. Total cellular RNA can be prepared from plant cells, tissues or organs by several methods, all well-known in the art, such as that described in Bormann et al. (1992, Mol. Microbiol. 6:317-326).

To assess the presence or relative quantity of protein translated from this mRNA, standard techniques, such as a Western blot, may be employed (see, for example, Ausubel et al. 1988, Current Protocols in Molecular Biology, Wiley: New York). In this process, total cellular proteins are extracted, separated by gel electrophoresis, transferred to a matrix such as nitrocellulose, and incubated with a probe, such as an antibody, which specifically binds to the desired protein. This probe is generally tagged with a chemiluminescent or colorimetric label which may be readily detected. The presence and quantity of label observed indicates the presence and quantity of the desired mutant protein present in the cell.
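As an illustration of how a "relative quantity" might be derived from blot signals, the following minimal sketch normalizes each sample's band intensity to a loading-control band and then expresses it relative to a reference sample. The use of a loading control, the function name relative_levels, the sample names and the intensity values are all assumptions made for illustration; none of them is taken from this document.

```python
def relative_levels(target, loading, reference):
    """Scale band intensities so that the reference sample equals 1.0.

    target    -- dict: sample name -> band intensity of the gene product
    loading   -- dict: sample name -> band intensity of a loading control
                 (hypothetical; the control itself is an assumption)
    reference -- name of the sample used as baseline, e.g. the wild type
    """
    if reference not in target:
        raise KeyError(f"reference sample {reference!r} not found")

    # Step 1: correct each lane for unequal loading.
    normalized = {}
    for name, signal in target.items():
        control = loading[name]
        if control <= 0:
            raise ValueError(f"loading control for {name!r} must be positive")
        normalized[name] = signal / control

    # Step 2: express every lane relative to the reference lane.
    baseline = normalized[reference]
    if baseline <= 0:
        raise ValueError("reference signal must be positive")
    return {name: value / baseline for name, value in normalized.items()}


if __name__ == "__main__":
    # Invented intensities in arbitrary densitometry units.
    bands = {"wild_type": 100.0, "line_1": 300.0}
    controls = {"wild_type": 50.0, "line_1": 75.0}
    print(relative_levels(bands, controls, "wild_type"))
```

The same arithmetic applies to Northern blot signals; only the interpretation (mRNA rather than protein) changes.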

The activity of LMPs that bind to DNA can be measured by several well-established methods, such as DNA band-shift assays (also called gel retardation assays). The effect of such LMPs on the expression of other molecules can be measured using reporter gene assays (such as that described in Kolmar H. et al. 1995, EMBO J. 14:3895-3904 and references cited therein). Reporter gene test systems are well known and established for applications in both prokaryotic and eukaryotic cells, using reporter proteins such as beta-galactosidase, green fluorescent protein, and several others.

The determination of activity of lipid metabolism membrane-transport proteins can be performed according to techniques such as those described in Gennis R. B. (1989, Pores, Channels and Transporters, in: Biomembranes, Molecular Structure and Function, Springer: Heidelberg, pp. 85-137, 199-234 and 270-322).

Example 14 In vitro Analysis of the Function of Arabidopsis thaliana Genes in Transgenic Plants

The determination of activities and kinetic parameters of enzymes is well established in the art. Experiments to determine the activity of any given altered enzyme must be tailored to the specific activity of the wild-type enzyme, which is well within the ability of one skilled in the art. Overviews about enzymes in general, as well as specific details concerning structure, kinetics, principles, methods, applications and examples for the determination of many enzyme activities may be found, for example, in the following references: Dixon, M. & Webb, E. C. 1979, Enzymes. Longmans: London; Fersht, (1985) Enzyme Structure and Mechanism. Freeman: New York; Walsh (1979) Enzymatic Reaction Mechanisms. Freeman: San Francisco; Price, N. C., Stevens, L. (1982) Fundamentals of Enzymology. Oxford Univ. Press: Oxford; Boyer, P. D., ed. (1983) The Enzymes, 3rd ed. Academic Press: New York; Bisswanger, H., (1994) Enzymkinetik, 2nd ed. VCH: Weinheim (ISBN 3527300325); Bergmeyer, H. U., Bergmeyer, J., Graßl, M., eds. (1983-1986) Methods of Enzymatic Analysis, 3rd ed., vol. I-XII, Verlag Chemie: Weinheim; and Ullmann's Encyclopedia of Industrial Chemistry (1987) vol. A9, Enzymes. VCH: Weinheim, p. 352-363.
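By way of illustration only, kinetic parameters of the kind referred to above could be estimated from initial-rate measurements as in the following minimal sketch. It assumes simple Michaelis-Menten behavior, v = Vmax·[S]/(Km + [S]), and uses a Lineweaver-Burk (double-reciprocal) least-squares fit. The function name fit_michaelis_menten and the data values are hypothetical and are not taken from this document or from the cited references.

```python
def fit_michaelis_menten(substrate, rates):
    """Estimate (Vmax, Km) by a Lineweaver-Burk double-reciprocal fit.

    Assumes Michaelis-Menten kinetics, so that
        1/v = (Km/Vmax) * (1/[S]) + 1/Vmax
    is a straight line in 1/[S].
    """
    if len(substrate) != len(rates) or len(substrate) < 2:
        raise ValueError("need at least two matched (substrate, rate) pairs")
    if any(s <= 0 for s in substrate) or any(v <= 0 for v in rates):
        raise ValueError("concentrations and rates must be positive")

    x = [1.0 / s for s in substrate]  # 1/[S]
    y = [1.0 / v for v in rates]      # 1/v
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Ordinary least-squares slope and intercept.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("substrate concentrations must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                     # equals Km / Vmax
    intercept = mean_y - slope * mean_x   # equals 1 / Vmax

    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


if __name__ == "__main__":
    # Invented, noise-free data generated with Vmax = 10 and Km = 2.
    conc = [0.5, 1.0, 2.0, 4.0, 8.0]
    rate = [10.0 * s / (2.0 + s) for s in conc]
    vmax, km = fit_michaelis_menten(conc, rate)
    print(f"Vmax = {vmax:.3f}, Km = {km:.3f}")
```

The double-reciprocal fit is chosen here only because it needs no external libraries; with noisy measurements it weights low-concentration points heavily, and a nonlinear fit of the rate equation would usually be preferred.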

Example 15 Purification of the Desired Product from Transformed Organisms

An LMP can be recovered from plant material by various methods well known in the art. Organs of plants can be separated mechanically from other tissue or organs prior to isolation of the seed storage compound from the plant organ. Following homogenization of the tissue, cellular debris is removed by centrifugation and the supernatant fraction containing the soluble proteins is retained for further purification of the desired compound. If the product is secreted from cells grown in culture, then the cells are removed from the culture by low-speed centrifugation and the supernatant fraction is retained for further purification.

The supernatant fraction from either purification method is subjected to chromatography with a suitable resin, in which the desired molecule is either retained on a chromatography resin while many of the impurities in the sample are not, or where the impurities are retained by the resin while the sample is not. Such chromatography steps may be repeated as necessary, using the same or different chromatography resins. One skilled in the art would be well-versed in the selection of appropriate chromatography resins and in their most efficacious application for a particular molecule to be purified. The purified product may be concentrated by filtration or ultrafiltration, and stored at a temperature at which the stability of the product is maximized.

There is a wide array of purification methods known in the art, and the preceding method of purification is not meant to be limiting. Such purification techniques are described, for example, in Bailey J. E. & Ollis D. F. 1986, Biochemical Engineering Fundamentals, McGraw-Hill: New York.

The identity and purity of the isolated compounds may be assessed by techniques standard in the art. These include high-performance liquid chromatography (HPLC) and other analytical chromatography, spectroscopic methods, staining methods, thin layer chromatography, NIRS, enzymatic assays, or microbiological assays. Such analysis methods are reviewed in: Patek et al. (1994, Appl. Environ. Microbiol. 60:133-140), Malakhova et al. (1996, Biotekhnologiya 11:27-32), Schmidt et al. (1998, Bioprocess Engineer 19:67-70), Ullmann's Encyclopedia of Industrial Chemistry (1996, Vol. A27, VCH: Weinheim, p. 89-90, p. 521-540, p. 540-547, p. 559-566, p. 575-581 and p. 581-587), Michal G. (1999, Biochemical Pathways: An Atlas of Biochemistry and Molecular Biology, John Wiley and Sons) and Fallon, A. et al. (1987, Applications of HPLC in Biochemistry, in: Laboratory Techniques in Biochemistry and Molecular Biology, vol. 17).

1. A method of producing a transgenic plant having an increased level of total fatty acids in a seed of the transgenic plant relative to a seed of a wild type plant comprising: i) transforming a plant cell with an expression vector comprising a lipid metabolism protein (LMP) nucleic acid and ii) generating from the plant cell the transgenic plant that expresses the LMP nucleic acid and that has an increased level of total fatty acids in the seed of the transgenic plant, wherein the nucleic acid encodes a polypeptide that comprises an isocitrate lyase domain and wherein the LMP nucleic acid comprises a nucleic acid sequence selected from the group consisting of: a) the isolated nucleic acid as defined in SEQ ID NO: 19; b) an isolated nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20; c) an isolated nucleic acid that hybridizes under stringent conditions to the full-length nucleic acid sequence as defined in SEQ ID NO: 19 or to a nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20, wherein the stringent conditions comprise hybridization in a 6× sodium chloride/sodium citrate (SSC) solution at about 45° C., followed by at least one wash in a 0.2× SSC, 0.1% sodium dodecyl sulfate (SDS) solution at 65° C.; d) an isolated nucleic acid having at least 95% sequence identity with the isolated full-length nucleic acid as defined in SEQ ID NO: 19 or with the nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20; e) a nucleic acid encoding a polypeptide comprising an amino acid sequence having at least 95% identity to the sequence of SEQ ID NO: 20; and f) a nucleic acid complementary to the nucleic acid of any of a) through e) above.
2. The method of claim 1, wherein the nucleic acid comprises a nucleic acid as defined in SEQ ID NO: 19.
3. The method of claim 1, wherein the nucleic acid comprises a nucleic acid having at least 95% sequence identity with the isolated full-length nucleic acid as defined in SEQ ID NO: 19.
4. The method of claim 1, wherein the nucleic acid comprises a nucleic acid that hybridizes under stringent conditions to the nucleic acid as defined in SEQ ID NO: 19 and wherein the stringent conditions comprise hybridization in a 6× sodium chloride/sodium citrate (SSC) solution at about 45° C., followed by at least one wash in a 0.2× SSC, 0.1% sodium dodecyl sulfate (SDS) solution at 65° C.
5. The method of claim 1, wherein the nucleic acid comprises a nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20 or a nucleic acid encoding a polypeptide comprising an amino acid sequence having at least 95% identity to the sequence of SEQ ID NO: 20.
6. The method of claim 1, wherein the nucleic acid comprises a nucleic acid having at least 95% sequence identity with the isolated full-length nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20.
7. The method of claim 1, wherein the nucleic acid comprises a nucleic acid that hybridizes under stringent conditions to the nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20 and wherein the stringent conditions comprise hybridization in a 6× sodium chloride/sodium citrate (SSC) solution at about 45° C., followed by at least one wash in a 0.2× SSC, 0.1% sodium dodecyl sulfate (SDS) solution at 65° C.
8. A transgenic plant produced by the method of claim 1, wherein the LMP nucleic acid is expressed in the plant, and wherein the seed of the plant has an increased level of the total fatty acid content as compared to the seed of an untransformed wild type variety of the plant.
9. The transgenic plant of claim 8, wherein the plant is a dicotyledonous plant.
10. The transgenic plant of claim 8, wherein the plant is a monocotyledonous plant.
11. The transgenic plant of claim 8, wherein the plant is an oil producing species.
12. The transgenic plant of claim 8, wherein the plant is selected from the group consisting of rapeseed, canola, linseed, soybean, sunflower, maize, oat, rye, barley, wheat, sugarbeet, tagetes, cotton, oil palm, coconut palm, flax, castor, and peanut.
13. A seed produced by the transgenic plant of claim 8, wherein the plant is true breeding for an increased level of the total fatty acid content in the seed as compared to a wild type variety of the plant.
14. A method of increasing the level of the total fatty acid content in a seed of a plant comprising, increasing the expression of a lipid metabolism protein (LMP) nucleic acid in a plant, wherein the nucleic acid encodes a polypeptide that comprises an isocitrate lyase domain, and wherein the LMP nucleic acid comprises a nucleic acid sequence selected from the group consisting of: a) the isolated nucleic acid as defined in SEQ ID NO: 19; b) an isolated nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20; c) an isolated nucleic acid that hybridizes under stringent conditions to the full-length nucleic acid sequence as defined in SEQ ID NO: 19 or to a nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20, wherein the stringent conditions comprise hybridization in a 6× sodium chloride/sodium citrate (SSC) solution at about 45° C., followed by at least one wash in a 0.2× SSC, 0.1% sodium dodecyl sulfate (SDS) solution at 65° C.; d) an isolated nucleic acid having at least 95% sequence identity with the isolated full-length nucleic acid as defined in SEQ ID NO: 19 or with the nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20; e) a nucleic acid encoding a polypeptide comprising an amino acid sequence having at least 95% identity to the sequence of SEQ ID NO: 20; and f) a nucleic acid complementary to the nucleic acid of any of a) through e) above.
15. The method of claim 14, wherein the nucleic acid comprises the nucleic acid as defined in SEQ ID NO: 19.
16. The method of claim 14, wherein the nucleic acid comprises a nucleic acid having at least 95% sequence identity with the isolated full-length nucleic acid as defined in SEQ ID NO: 19.
17. The method of claim 14, wherein the nucleic acid comprises a nucleic acid that hybridizes under stringent conditions to the nucleic acid as defined in SEQ ID NO: 19, and wherein the stringent conditions comprise hybridization in a 6× sodium chloride/sodium citrate (SSC) solution at about 45° C., followed by at least one wash in a 0.2× SSC, 0.1% sodium dodecyl sulfate (SDS) solution at 65° C.
18. The method of claim 14, wherein the nucleic acid comprises the nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20 or a nucleic acid encoding a polypeptide comprising an amino acid sequence having at least 95% identity to the sequence of SEQ ID NO: 20.
19. The method of claim 14, wherein the nucleic acid comprises a nucleic acid having at least 95% sequence identity with the isolated full-length nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20.
20. The method of claim 14, wherein the nucleic acid comprises a nucleic acid that hybridizes under stringent conditions to a nucleic acid encoding the polypeptide as defined in SEQ ID NO: 20, wherein the stringent conditions comprise hybridization in a 6× sodium chloride/sodium citrate (SSC) solution at about 45° C., followed by at least one wash in a 0.2× SSC, 0.1% sodium dodecyl sulfate (SDS) solution at 65° C.
21. The method of claim 14, wherein the plant is transgenic.
22. The method of claim 14, wherein the plant is not transgenic.